1
00:00:00,000 --> 00:00:00,400


2
00:00:00,400 --> 00:00:06,700
Hello, and welcome to a video on calculating odds and risk ratios in Python. Today, we won't be loading in any data.

3
00:00:06,700 --> 00:00:06,733


4
00:00:06,733 --> 00:00:12,866
So let's start with loading our libraries and get right to it. You can see that from scipy.stats, we're importing fisher_exact,

5
00:00:12,866 --> 00:00:19,066
and we're importing NumPy as np, so we'll go ahead and load those in. First, let's calculate

6
00:00:19,066 --> 00:00:19,099


7
00:00:19,100 --> 00:00:25,500
the odds and risk ratio manually. Let's use some variables from an example problem. Let's say we have a contingency

8
00:00:25,500 --> 00:00:25,533


9
00:00:25,533 --> 00:00:31,599
table like this, where A equals 40, B equals 60, C equals 30, and D equals

10
00:00:31,600 --> 00:00:38,033
70. You can see we're using a new notation to put the information into these variables this time.

11
00:00:38,033 --> 00:00:39,966


12
00:00:39,966 --> 00:00:46,166
Now first, let's say we want to calculate our odds ratio. We could do this by dividing

13
00:00:46,166 --> 00:00:52,166
A by B over C divided by D. And if you need a visual representation of what that looks like,

14
00:00:52,166 --> 00:00:52,199


15
00:00:52,200 --> 00:00:58,500
I have one here. So A divided

16
00:00:58,500 --> 00:01:04,800
by B over C divided by D. A, B, C, and

17
00:01:04,800 --> 00:01:05,000


18
00:01:05,000 --> 00:01:11,000
D are arranged in a two by two contingency table. And sometimes seeing this makes

19
00:01:11,000 --> 00:01:13,700
it easier to visualize what we're doing with the math.

20
00:01:13,700 --> 00:01:17,900


21
00:01:17,900 --> 00:01:23,900
We can also find our risk ratio by dividing one risk by the other,

22
00:01:23,900 --> 00:01:30,033
basically. We'll go back over this in a moment below. But for right now, let's

23
00:01:30,033 --> 00:01:30,066


24
00:01:30,066 --> 00:01:35,932
run this code. And you can see we found our odds ratio, which is 1.5 repeating and

25
00:01:35,933 --> 00:01:36,499


26
00:01:36,500 --> 00:01:42,666
our risk ratio, which is 1.3 almost repeating. Now, this will

27
00:01:42,666 --> 00:01:48,799
work the same way if we define each variable separately, where 40, 60, 30, and 70 each get put

28
00:01:48,800 --> 00:01:54,933
into their own initialized variable. We can find the

29
00:01:54,933 --> 00:02:01,066
odds ratio in a different way this time with a times d divided by b times c.

30
00:02:01,066 --> 00:02:01,332


31
00:02:01,333 --> 00:02:07,133
So a times d divided by b times c will give you the same result

32
00:02:07,133 --> 00:02:07,633


33
00:02:07,633 --> 00:02:13,666
as what we did above. We can also find our risk ratio

34
00:02:13,666 --> 00:02:19,699
by dividing two risks. So we'll take a divided by

35
00:02:19,700 --> 00:02:19,733


36
00:02:19,733 --> 00:02:26,066
a plus b, because a plus b is the marginal total

37
00:02:26,066 --> 00:02:27,566


38
00:02:27,566 --> 00:02:33,966
of this 40 and 60. C plus d would be the marginal total down here. When I click calculate

39
00:02:33,966 --> 00:02:39,132
here, you can see that we've now got our marginal totals out here,

40
00:02:39,133 --> 00:02:40,766


41
00:02:40,766 --> 00:02:45,466
100, 70, and 130. So this would be 40

42
00:02:45,466 --> 00:02:48,066


43
00:02:48,066 --> 00:02:54,432
divided by 100, which is the risk of success for employed people

44
00:02:54,433 --> 00:03:00,533
in this case. Let's say that this is a table for success in interviews if you're employed versus

45
00:03:00,533 --> 00:03:01,366
unemployed.

46
00:03:01,366 --> 00:03:09,966


47
00:03:09,966 --> 00:03:15,699
Okay, and then we have risk of C, which is the same thing, which is this

48
00:03:15,700 --> 00:03:16,066


49
00:03:16,066 --> 00:03:21,766
divided by, or yes, divided by 30 plus 70, which is 30

50
00:03:21,766 --> 00:03:22,166


51
00:03:22,166 --> 00:03:28,232
divided by 100. And then we

52
00:03:28,233 --> 00:03:34,499
can find the risk ratio of those two risks by dividing risk of A by risk of C. You can also

53
00:03:34,500 --> 00:03:34,533


54
00:03:34,533 --> 00:03:40,533
do this all in one step, which you can see is just combining these two into one line. And

55
00:03:40,533 --> 00:03:46,533
when I run that, we can see they both give exactly the same result.
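
The manual calculations narrated so far can be sketched roughly as follows. The variable names are assumptions, since the notebook itself is not shown in the transcript; only the cell counts (40, 60, 30, 70) and the formulas come from the video.

```python
# Cell counts from the 2x2 contingency table described in the video.
a, b = 40, 60   # first row (row total a + b = 100)
c, d = 30, 70   # second row (row total c + d = 100)

# Odds ratio, written two equivalent ways.
odds_ratio = (a / b) / (c / d)          # odds of row 1 over odds of row 2
odds_ratio_cross = (a * d) / (b * c)    # cross-product form

# Risk ratio: each risk is a cell divided by its row (marginal) total.
risk_a = a / (a + b)   # 40 / 100
risk_c = c / (c + d)   # 30 / 100
risk_ratio = risk_a / risk_c

# The same risk ratio in a single step.
risk_ratio_one_step = (a / (a + b)) / (c / (c + d))

print(round(odds_ratio, 4), round(odds_ratio_cross, 4))
print(round(risk_ratio, 4), round(risk_ratio_one_step, 4))
```

Both odds ratio forms come out at about 1.5556, and both risk ratio forms at about 1.3333.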

56
00:03:46,533 --> 00:03:47,233


57
00:03:47,233 --> 00:03:53,433
And it is the same result we got above as well. In Python, there's

58
00:03:53,433 --> 00:03:53,466


59
00:03:53,466 --> 00:03:59,699
also a function from the scipy.stats package that will perform this automatically, given an array. An array

60
00:03:59,700 --> 00:03:59,733


61
00:03:59,733 --> 00:04:05,833
in Python is like a list, except it can only store values of the same data type. So if we put numbers into it, they

62
00:04:05,833 --> 00:04:11,966
have to be all numbers. Okay, now the same

63
00:04:11,966 --> 00:04:11,999


64
00:04:12,000 --> 00:04:17,833
as above, where we had a, b, and we gave it some information.

65
00:04:17,833 --> 00:04:18,866


66
00:04:18,866 --> 00:04:24,932
This time, we're saying odds ratio, comma, p-value, is equal to

67
00:04:24,933 --> 00:04:31,133
fisher_exact of table. And our table is right here. The reason we're able

68
00:04:31,133 --> 00:04:36,766
to do this is because we know fisher_exact returns two items.

69
00:04:36,766 --> 00:04:43,132


70
00:04:43,133 --> 00:04:48,966
So the odds ratio from Fisher's test is 1.56. Yep, that's what we were expecting from above.
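
Here is a minimal sketch of the fisher_exact step, assuming the table is stored as a 2x2 NumPy array. The names `table`, `odds_ratio`, and `p_value` are guesses based on the narration, not the notebook's actual code.

```python
import numpy as np
from scipy.stats import fisher_exact

# 2x2 contingency table with the same cell counts as the manual example.
table = np.array([[40, 60],
                  [30, 70]])

# fisher_exact returns two values, so we can unpack them in one line:
# the sample odds ratio and the p-value of Fisher's exact test.
odds_ratio, p_value = fisher_exact(table)

print(round(odds_ratio, 2))
print(p_value)
```

The odds ratio here is the same cross-product value as the manual calculation, (40 * 70) / (60 * 30), which rounds to 1.56.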

71
00:04:48,966 --> 00:04:49,299


72
00:04:49,300 --> 00:04:54,866
fisher_exact does not give us a risk ratio like this, so here you will just have to do it by hand in Python, unfortunately.
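
As a side note that is not part of the video: fisher_exact itself does not report a risk ratio, but SciPy releases from 1.7 onward should include a separate helper, `scipy.stats.contingency.relative_risk`. The sketch below assumes that version is installed and treats the first row as the "exposed" group and the second as the "control" group, which is a labeling choice made here for illustration.

```python
from scipy.stats.contingency import relative_risk

# Arguments are counts, not proportions:
# cases and row total for the first row, then for the second row.
result = relative_risk(exposed_cases=40, exposed_total=100,
                       control_cases=30, control_total=100)

# The point estimate is (40 / 100) / (30 / 100).
print(round(result.relative_risk, 4))
```

This should agree with the manual risk ratio of about 1.3333.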

73
00:04:54,866 --> 00:04:55,299


74
00:04:55,300 --> 00:05:01,466
But if you'd like to use this method to find your odds ratio, you are more than welcome to. All

75
00:05:01,466 --> 00:05:02,532
right, happy coding.