Test of Independence Study Material

Let us start the topic by reviewing the concept of independent events. After all, as the name implies, the “test of independence” tests whether two events are independent or not.

  1. Independent Events

We previously defined two events A and B as independent if P(A) = P(A|B), that is, if the chance of A is the same as the chance of A given B. The example we used involved calculating P(red card) and P(odd number card). We found that P(odd number card) = P(odd number card | red card). The probability of picking an odd number card is the same regardless of whether you pick it 1) from the entire deck of cards, P(odd number card), or 2) from the red cards in the deck, P(odd number card | red card).

You can verify this conclusion with the numbers: both fractions below reduce to 7/13.

  • P(odd number card) = 28/52
  • P(odd number card | red card) = 14/26
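The two fractions above can be checked with exact arithmetic. The Python sketch below builds a 52-card deck and compares the two probabilities. It assumes the ace counts as 1, the jack as 11, the queen as 12, and the king as 13, a mapping chosen only because it reproduces the 28 odd cards counted above.

```python
from fractions import Fraction

# Ranks 1..13 (ace = 1, jack = 11, queen = 12, king = 13) in four suits.
# This rank-to-number mapping is an assumption that matches the counts above.
RED_SUITS = {"hearts", "diamonds"}
SUITS = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for suit in SUITS for rank in range(1, 14)]

odd_cards = [card for card in deck if card[0] % 2 == 1]
red_cards = [card for card in deck if card[1] in RED_SUITS]
odd_red_cards = [card for card in red_cards if card[0] % 2 == 1]

p_odd = Fraction(len(odd_cards), len(deck))                     # 28/52
p_odd_given_red = Fraction(len(odd_red_cards), len(red_cards))  # 14/26

# Independence means the conditional probability equals the marginal one.
print(p_odd, p_odd_given_red, p_odd == p_odd_given_red)
```

Using Fraction instead of floating-point division keeps the comparison exact, so the equality check cannot fail because of rounding.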

Table 1. Cross-tabulation of Even/Odd Number and Color of Poker Cards

 

                Odd number    Even number    Row Total
Red             14            12             26
Black           14            12             26
Column Total    28            24             52

 

When we say two events are independent, we mean that knowing one event has occurred does not change the probability of the other: each conditional probability equals the corresponding marginal probability (i.e., the probability of the event by itself).

In the test of independence, we determine whether two categorical variables are independent by summarizing them in a cross-tabulation. This statement carries several boundaries and conditions for the test of independence:

  1. At the current stage, we will only conduct tests that involve two dimensions or events (e.g., black or red and odd or even);
  2. The variables involved will be categorical data;
  3. You need to summarize your sample data in the cross-tabulation format.

Please see below for a snippet of a sample dataset that is ideal for this type of test.


Figure 1. A snippet of sample data

The sample data involves two events (or dimensions): Office and Yes/No. “Office” has three values: Office 1, Office 2, and Office 3. The Yes/No event has two values: Yes and No. Therefore, the cross-tabulation for this data can take the following form.

 

                Office 1    Office 2    Office 3    Total Yes/No
Yes             x1-yes      x2-yes      x3-yes      Σx-yes
No              x1-no       x2-no       x3-no       Σx-no
Total Office    Σx1         Σx2         Σx3         Σx

 

(* You will be provided with a data file that contains two or more variables that can be used to conduct a test of independence. Please refer to the later section on how to use a pivot table to make a cross-tabulation like the one above.)
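Outside a spreadsheet, the same cross-tabulation can be sketched in a few lines of Python. The records below are hypothetical, made up only to match the shape of the sample data (one office value and one Yes/No value per row), and the function name cross_tabulate is likewise just an illustration.

```python
from collections import Counter

# Hypothetical records in the same shape as the sample data: (office, answer).
records = [
    ("Office 1", "Yes"), ("Office 1", "No"), ("Office 2", "Yes"),
    ("Office 2", "Yes"), ("Office 3", "No"), ("Office 3", "No"),
    ("Office 1", "Yes"), ("Office 3", "Yes"),
]

def cross_tabulate(rows):
    """Count each (office, answer) pair and derive row, column, and grand totals."""
    cells = Counter(rows)
    office_totals = Counter(office for office, _ in rows)
    answer_totals = Counter(answer for _, answer in rows)
    return cells, office_totals, answer_totals, len(rows)

cells, office_totals, answer_totals, grand_total = cross_tabulate(records)

# Print the table in the same layout as above: one row per answer,
# one column per office, totals in the last column and last row.
offices = sorted(office_totals)
for answer in ("Yes", "No"):
    counts = [cells[(office, answer)] for office in offices]
    print(answer, counts, answer_totals[answer])
print("Total Office", [office_totals[office] for office in offices], grand_total)
```

A Counter returns 0 for a combination that never occurs, so empty cells (here, Office 2 with No) need no special handling.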

  2. Expected Form of Cross-Tabulation when Two Events (dimensions) are Independent

Some books say that we are testing the independence of “two variables”. I find that wording confusing, because a dataset may contain more than two variables (as in the sample dataset above). What the test actually considers is two events, or two dimensions.

The following table is set up to show whether two events, degree and income, are independent. Each x is the frequency of one of the four degree-and-income combinations. We know the row and column totals, but not the individual x values.

 

                        High Income    Low Income    Row Total
Graduate Degree         xg-high        xg-low        200
Undergraduate Degree    xu-high        xu-low        400
Column Total            240            360           600

 

WHEN TWO EVENTS ARE INDEPENDENT, the table follows a pattern: even though we do not know the individual x values, we can infer them from the row and column totals alone. Please refer to the following calculation.

If degree and income are independent, based on the previously learned probability calculation, we have:

  • P(High Income) = P(High Income | Graduate Degree), so 240/600 = xg-high/200
  • xg-high = 80
  • The proportion of the High Income column total (240) over the grand total (600) is the same as the graduate-and-high-income count (xg-high = 80) over the Graduate Degree row total (200). Likewise, the proportion of the Graduate Degree row total (200) over the grand total (600) is the same as xg-high over the High Income column total (240).
  • You can work out the remaining three x values the same way.

Do you notice the pattern, and do you understand why it holds?

  • If income and degree have nothing to do with each other, the split between low and high income should be about the same within each degree group (undergraduate and graduate).
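Rearranging 240/600 = xg-high/200 gives xg-high = 200 × 240 / 600, and the same rule (row total × column total / grand total) fills every cell. Here is a minimal Python sketch of that rule; the function name expected_counts is only an illustration.

```python
from fractions import Fraction

def expected_counts(row_totals, column_totals):
    """Expected cell counts under independence: row total * column total / grand total."""
    grand_total = sum(row_totals.values())
    if grand_total != sum(column_totals.values()):
        raise ValueError("row totals and column totals must share the same grand total")
    return {
        (row, col): Fraction(r_total * c_total, grand_total)
        for row, r_total in row_totals.items()
        for col, c_total in column_totals.items()
    }

# Totals taken from the degree-and-income table above.
row_totals = {"Graduate Degree": 200, "Undergraduate Degree": 400}
column_totals = {"High Income": 240, "Low Income": 360}

expected = expected_counts(row_totals, column_totals)
for cell, count in expected.items():
    print(cell, count)
```

For these totals the four expected counts come out as 80, 120, 160, and 240.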

Based on this expected pattern of cross-tabulation of two independent events, we can conduct the test of independence. In this test, instead of relying on previously used sample statistics (i.e., mean, standard deviation, and proportion), we use the frequency pattern of each group.

  3. Measuring the Differences between Expected and Observed Pattern using Chi-square

Here is the completed cross-tabulation of expected frequencies when the two events are independent.

 

                        High Income    Low Income    Row Total
Graduate Degree         80             120           200
Undergraduate Degree    160            240           400
Column Total            240            360           600

 

The following table shows the actual (or observed) frequency for each group. (Remark: it is just an example I made up.)

 

                        High Income    Low Income    Row Total
Graduate Degree         160            40            200
Undergraduate Degree    80             320           400
Column Total            240            360           600

 

Let us now see how much the observed frequencies deviate from the expected frequencies. To do so, I use a test statistic called chi-square, which sums (observed - expected)^2 / expected over all cells.

Chi-square = (160-80)^2/80 + (40-120)^2/120 + (80-160)^2/160 + (320-240)^2/240 = 80 + 53.33 + 40 + 26.67 = 200
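A minimal Python sketch of this calculation follows, assuming the usual Pearson form of the statistic, in which each squared difference is divided by the expected count for that cell. The function name chi_square and the shortened cell labels are only illustrations.

```python
def chi_square(observed, expected):
    """Pearson chi-square: sum of (observed - expected)^2 / expected over all cells."""
    if observed.keys() != expected.keys():
        raise ValueError("observed and expected must cover the same cells")
    return sum(
        (observed[cell] - expected[cell]) ** 2 / expected[cell]
        for cell in observed
    )

# Expected and observed frequencies from the two tables above.
expected = {("Graduate", "High"): 80, ("Graduate", "Low"): 120,
            ("Undergraduate", "High"): 160, ("Undergraduate", "Low"): 240}
observed = {("Graduate", "High"): 160, ("Graduate", "Low"): 40,
            ("Undergraduate", "High"): 80, ("Undergraduate", "Low"): 320}

statistic = chi_square(observed, expected)
print(round(statistic, 2))
```

With the tables above, the four cell contributions are 80, 53.33, 40, and 26.67, so the script prints 200.0.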

Posted in Data Analytics

Astrill VPN (Client) App for Android 5+ (Lollipop)

I found that after my smartphone updated to Android 5.0.1, the Astrill VPN app no longer worked. It kept returning an error message that says “No process running.”

I wrote an email to Astrill customer support, and they replied with a beta version of the app, which works fine on my Nexus 5 running Android 5.1.

You can give it a try from the following link:

Link: https://getastrill.com/downloads/AstrillVpn-2.0-Beta5.apk

Posted in App of the Week, Internet and Cloud, Mobile

Counterintuitive Mac OS X Series 1: How to Completely Delete VPN.

Mac OS X is an awesome operating system, but sometimes it is not as intuitive as it appears to be. For example,

Do you know how to delete VPN connections completely?

You might try the “System Preferences -> Network” settings, but you will soon discover that the removal button is greyed out and you cannot press it.

Some people in the Apple Support Communities suggested a command-line way to remove them (see the source link below). If your VPN connections have rather complicated names, you had better rename them first and then delete them one by one with the command. (You can rename them on the Network preferences page.)


Source: https://discussions.apple.com/thread/3828655?start=0&tstart=0

It mostly works, except that it does not show how to remove the last VPN connection; the suggested command cannot remove the last one.

To delete a VPN completely, you need to remove the profiles that were created when the VPN service was installed. If you open “System Preferences”, you will see a “Profiles” icon. Open it and delete the profiles associated with the VPN you would like to remove, and the VPN will be gone. You will not even need the aforementioned method.


Posted in Computer Tips, Mac OS X

How to delete multiple articles at once at Kindle digital library

For some people, Kindle has become more than an option. I use it daily to read articles, especially longer ones. With the Amazon-provided “Send to Kindle” extension for the Chrome browser, sending documents to Kindle devices has become very convenient.


But the Kindle digital library has very poor management functions: you can delete a single article, but you cannot delete several articles at once. That is probably why you ended up here.

In the Chrome browser, go to the page that lists the documents you want to delete from the Kindle digital library and press “Ctrl + Shift + J”; the developer tools will pop up. Select the Console tab, paste the following lines of code, and press Enter.


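(function () {
  // Only run on the Kindle "Personal Documents" page.
  if (!/PersonalDocuments/.test(document.URL)) { return false; }
  // Each collapsed row is one document; its 'asin' attribute identifies it.
  var rows = document.getElementsByClassName('rowBodyCollapsed');
  for (var i = 0; i < rows.length; i++) {
    // Fion.deleteItem is defined by the Amazon page itself, not by this snippet.
    Fion.deleteItem('deleteItem_' + rows[i].getAttribute('asin'));
  }
})();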

You will then see the page deleting the listed items.

Source: http://www.amazon.com/forum/kindle?_encoding=UTF8&cdForum=Fx1D7SY3BVSESG&cdThread=Tx2INMU6OCSGC2Z

Update: 

You might also want to try the following website: http://massdestroykindleitems.com

Posted in Computer Tips, Internet and Cloud