Basics of Probability Theory for Data Science

1. Basic Concepts for Probability

Random Experiment:

  • A trial that can have more than one possible outcome.
  • The trial should be replicable under fixed conditions.
  • The outcome of the trial is unpredictable.

Event: A specific outcome of a random experiment (e.g. $X = 1$).

Fundamental Event: The finest-grained event defined according to the objective of the random experiment (it is not possible, or not necessary, to split it further). For example, for a throw of a die, the fundamental events are face = 1, 2, ..., 6.

Compound Event: An event that consists of multiple fundamental events (e.g. for a throw of a die: face < 5).

Sample Space: The collection of all possible fundamental events, written $\Omega$ (e.g. for two flips of a coin, $\Omega = \{HH, HT, TH, TT\}$).

Random Variable: A function $X$ that maps each sample point $\omega$ in the sample space $\Omega$ to a real number $X(\omega)$. Strictly, an event is a collection of sample points, such as $\{\omega : X(\omega) = 1\}$, but in most practical use it is enough to think of an event as an outcome.
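The definitions above can be made concrete with a small sketch. It enumerates the sample space for two coin flips and treats "number of heads" as a random variable; the names `sample_space`, `num_heads` and `event_one_head` are illustrative choices, not part of the original text.

```python
from itertools import product

# Sample space for two flips of a coin: every ordered pair of H/T.
sample_space = list(product("HT", repeat=2))

def num_heads(outcome):
    """Random variable X: maps a sample point to the number of heads in it."""
    return outcome.count("H")

# The event "X = 1" is the collection of sample points that X maps to 1.
event_one_head = [w for w in sample_space if num_heads(w) == 1]

print(sample_space)    # the four fundamental events
print(event_one_head)  # the sample points with exactly one head
```

Here the event "X = 1" is a compound event: it contains two fundamental events.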

2. Interpretation of Probability

Probability describes how likely an event is to happen. In probability theory, the following axioms are given:

$$P(A) \ge 0$$

$$P(\Omega) = 1$$

$$P(A \cup B) = P(A) + P(B)$$

where $\Omega$ is the certain event (containing all fundamental events), and $A$ and $B$ are mutually exclusive.

2.1 Classical Model of Probability

In the classical interpretation of probability, two assumptions are considered satisfied:

  1. The sample space contains a finite number of fundamental events
  2. All fundamental events are equally likely

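Under such assumptions, the probability of an event A can be defined as

$$P(A) = \frac{m}{n}$$

where $n$ is the number of fundamental events in the sample space, and $m$ is the number of fundamental events in event A.

For most classical probability cases, $n$ and $m$ can be calculated from permutations and combinations. The number of ordered selections and the number of unordered selections of $k$ items out of $N$ are, respectively:

$$A_N^k = \frac{N!}{(N-k)!}, \qquad C_N^k = \binom{N}{k} = \frac{N!}{k!\,(N-k)!}$$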
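As a rough illustration of the classical model, the sketch below counts fundamental events directly for the die example (face < 5) and then uses combinations for a hypothetical urn with 3 red and 2 blue balls. The urn and all variable names are assumptions made for this example.

```python
from fractions import Fraction
from math import comb, perm

# Die example: n fundamental events in the sample space, m of them in the event.
faces = range(1, 7)
event = [f for f in faces if f < 5]
p_less_than_5 = Fraction(len(event), len(faces))

# Hypothetical urn: draw 2 of 5 balls (3 red, 2 blue) without replacement.
# Every unordered pair is equally likely, so count pairs with combinations.
p_both_red = Fraction(comb(3, 2), comb(5, 2))

# Ordered selections of 2 balls out of 5 (permutations).
ordered_pairs = perm(5, 2)

print(p_less_than_5)  # 2/3
print(p_both_red)     # 3/10
print(ordered_pairs)  # 20
```

`Fraction` keeps the results exact, which makes it easy to compare them with hand calculations.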

2.2 Geometric Model of Probability

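Define a geometric measure $\mu(A)$ of an event A (e.g. the length of a line segment, or an area). If every point of the sample space $\Omega$ is equally likely, the probability of A is the ratio of the two measures:

$$P(A) = \frac{\mu(A)}{\mu(\Omega)}$$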
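A minimal sketch of the geometric model, assuming the usual ratio-of-measures definition (measure of the event divided by measure of the whole space). Both scenarios, a point on a segment of length 10 and a point in a 2 x 2 square, are made up for illustration.

```python
import math
from fractions import Fraction

def geometric_probability(event_measure, space_measure):
    """Ratio of the event's measure to the measure of the whole sample space."""
    return event_measure / space_measure

# A point chosen uniformly on a segment of length 10 lands in a sub-segment of length 3.
p_segment = geometric_probability(Fraction(3), Fraction(10))

# A point chosen uniformly in a 2 x 2 square lands inside the inscribed unit circle.
p_circle = geometric_probability(math.pi * 1.0 ** 2, 2.0 * 2.0)

print(p_segment)  # 3/10
print(p_circle)   # pi / 4, roughly 0.785
```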

2.3 Frequency and Statistical Probability

Suppose a random experiment is conducted $n$ times and event A happens $m$ times. Then define the frequency of event A as:

$$f_n(A) = \frac{m}{n}$$

Then the Statistical Probability of event A is the value that the frequency settles on as the number of trials grows:

$$P(A) = \lim_{n \to \infty} f_n(A)$$

Note that probability is an intrinsic property of a random variable. Statistical Probability is a mathematical approximation of the real probability.
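The relationship between frequency and probability can be seen in a small simulation. This is only a sketch: it simulates die throws with Python's pseudo-random generator, uses the event "face < 5" (true probability 2/3), and the function name `frequency` and the seed are arbitrary choices.

```python
import random

def frequency(n, seed=0):
    """Run n simulated die throws and return m / n for the event 'face < 5'."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    m = sum(1 for _ in range(n) if rng.randint(1, 6) < 5)
    return m / n

# The frequency should drift toward 2/3 as n grows.
for n in (100, 10_000, 100_000):
    print(n, frequency(n))
```

The printed values depend on the seed, but for large n they should sit close to 0.667.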

3. Basic Theorems in Probability Theory

4. Conditional Probability, Joint Probability and Independence

4.1 Conditional Probability

Let A, B be two events in sample space $\Omega$. The probability of A given the condition that B has happened is called the conditional probability, denoted as $P(A \mid B)$:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) > 0$$

The sample space of $P(A \mid B)$ is B, not $\Omega$. Conditioning means "compression" of the sample space. According to the axioms of probability theory, the probability of the whole sample space is 1. Thus, letting A be the event that the random variable takes the value $X = a$ and summing over all possible values $a$:

$$\sum_{a} P(X = a \mid B) = 1$$

A and B can be two events of the same variable, or each can be an event of a separate variable.
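The "compression" idea can be checked by enumeration. The sketch below uses one throw of a die, with A = "face is even" and B = "face < 5" as example events (both chosen for illustration), and computes the conditional probability two ways: with the ratio P(A and B) / P(B), and by counting inside B as if B were the whole sample space.

```python
from fractions import Fraction

omega = set(range(1, 7))              # one throw of a die
A = {f for f in omega if f % 2 == 0}  # face is even
B = {f for f in omega if f < 5}       # face < 5

def prob(event, space):
    """Classical probability of `event` when `space` is the sample space."""
    return Fraction(len(event & space), len(space))

# Ratio form: P(A and B) / P(B), both measured in the full sample space.
p_ratio = prob(A & B, omega) / prob(B, omega)

# Compressed form: treat B itself as the sample space.
p_compressed = prob(A, B)

# Conditional probabilities over all values of the variable still sum to 1.
total_given_b = sum(prob({a}, B) for a in omega)

print(p_ratio, p_compressed, total_given_b)  # 1/2 1/2 1
```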

4.2 Law of Total Probability

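Let $B_1, B_2, \dots, B_n$ be a series of mutually exclusive and collectively exhaustive events, and let A be another event:

$$P(A) = \sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)$$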
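A quick numeric check of the law of total probability, which in its usual form says P(A) equals the sum of P(A | B_i) P(B_i) over a partition B_1, ..., B_n. The partition of the die faces and the event A below are arbitrary choices for illustration.

```python
from fractions import Fraction

omega = set(range(1, 7))
partition = [{1, 2}, {3, 4}, {5, 6}]  # mutually exclusive, collectively exhaustive
A = {2, 4, 6}                         # face is even

def prob(event, space):
    """Classical probability of `event` when `space` is the sample space."""
    return Fraction(len(event & space), len(space))

# Sum of P(A | B_i) * P(B_i) over the partition.
p_total = sum(prob(A, b) * prob(b, omega) for b in partition)

# Direct computation for comparison.
p_direct = prob(A, omega)

print(p_total, p_direct)  # 1/2 1/2
```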

4.3 Joint Probability

Assume a two-dimensional sample space is determined by two random experiments, which means we have two random variables X and Y on the sample space. Let A, B be a certain outcome of X and Y respectively (e.g. A: $X = a$, B: $Y = b$). The probability that events A and B both happen is called the joint probability of A and B, denoted as $P(A \cap B)$, or $P(X = a, Y = b)$.

The joint probability has the following properties:

$$P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$

If X and Y are independent:

$$P(A \cap B) = P(A)\,P(B)$$

For such a sample space:

$$\sum_{a} \sum_{b} P(X = a, Y = b) = 1$$

Associated with the Law of Total Probability:

$$P(X = a) = \sum_{b} P(X = a, Y = b)$$
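To make the joint distribution tangible, the sketch below builds a joint table for two fair coin flips, with X = result of the first flip (0 or 1) and Y = total number of heads. These two variables are an assumption chosen for the example; the marginal of X is then recovered by summing the joint table over Y.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint distribution P(X = x, Y = y), built by enumerating the four equally likely outcomes.
joint = defaultdict(Fraction)
for flips in product((0, 1), repeat=2):
    x, y = flips[0], sum(flips)
    joint[(x, y)] += Fraction(1, 4)

# Marginal of X: sum the joint probabilities over every value of Y.
marginal_x = defaultdict(Fraction)
for (x, y), p in joint.items():
    marginal_x[x] += p

print(dict(joint))
print(dict(marginal_x))     # each value of X has probability 1/2
print(sum(joint.values()))  # 1
```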

4.4 Bayesian Law

Let $A_1, A_2, \dots, A_n$ be a series of mutually exclusive and collectively exhaustive events, and let B be another event:

$$P(A_i \mid B) = \frac{P(B \mid A_i)\,P(A_i)}{\sum_{j=1}^{n} P(B \mid A_j)\,P(A_j)}$$

where we call:

  • $A_i$: hypothesis event, an event whose probability we want to assess through observations of evidence
  • $B$: evidence, an event used to update knowledge about the hypothesis event
  • $P(A_i)$: prior probability, representing the knowledge before the evidence emerges
  • $P(B \mid A_i)$: likelihood, representing the probability of B under event $A_i$
  • $P(A_i \mid B)$: posterior probability, representing the updated knowledge after the evidence emerges
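The update from prior to posterior can be sketched in a few lines, using the standard form of Bayes' rule (posterior = prior times likelihood, divided by the total probability of the evidence). The coin scenario, the hypothesis names and the helper `posterior` are all hypothetical, chosen only to show the mechanics.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return P(hypothesis | evidence) for every hypothesis.

    priors:      P(hypothesis)
    likelihoods: P(evidence | hypothesis)
    """
    # Denominator: total probability of the evidence over all hypotheses.
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}

# Hypothetical setup: a coin is either fair or biased toward heads,
# and the evidence is that one flip came up heads.
priors = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihoods = {"fair": Fraction(1, 2), "biased": Fraction(9, 10)}

post = posterior(priors, likelihoods)
print(post["fair"], post["biased"])  # 5/14 9/14
```

Observing heads shifts belief toward the biased hypothesis, and the posteriors still sum to 1.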

Specific examples of Bayesian inference can be found in this article.

4.5 Independence of Events

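If the probability of A is not affected by whether event B happens, then A is independent of B. In conditional probability form:

$$P(A \mid B) = P(A)$$

Note that independence, $P(A \cap B) = P(A)\,P(B)$, and mutual exclusivity, $P(A \cap B) = 0$, cannot both be true when $P(A) > 0$ and $P(B) > 0$.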
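Independence can be tested by comparing P(A and B) with P(A) P(B), which is equivalent to the conditional form whenever P(B) > 0. The sketch below does this for a few die events picked for illustration, including a mutually exclusive pair, to show that exclusive events with positive probability are not independent.

```python
from fractions import Fraction

omega = set(range(1, 7))  # one throw of a die

def prob(event):
    """Classical probability of an event in the die sample space."""
    return Fraction(len(event & omega), len(omega))

def independent(a, b):
    """True when P(a and b) equals P(a) * P(b)."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
less_than_5 = {1, 2, 3, 4}
odd = {1, 3, 5}

print(independent(even, less_than_5))  # True: 1/3 == 1/2 * 2/3
print(independent(even, odd))          # False: exclusive events, 0 != 1/4
```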

5. Probability Distribution & Probability Density Function

5.1 Discrete Random Variable and Probability Distribution

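If the possible values of a random variable are countable, then it is a discrete random variable. The probability distribution of a discrete random variable X is defined as a function:

$$p(x_i) = P(X = x_i), \qquad i = 1, 2, \dots$$

This probability distribution function (often called the probability mass function) has the following properties:

$$p(x_i) \ge 0, \qquad \sum_{i} p(x_i) = 1$$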
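As an example of a discrete distribution, the sketch below builds the distribution of the sum of two dice (a variable chosen for illustration) and checks the two standard properties of such a distribution: every probability is non-negative and the probabilities sum to 1.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability distribution of the discrete variable X = sum of two dice.
pmf = {s: Fraction(c, 36) for s, c in counts.items()}

print(pmf[7])                             # 1/6
print(all(p >= 0 for p in pmf.values()))  # True
print(sum(pmf.values()))                  # 1
```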

5.2 Continuous Random Variable and Probability Density

If a random variable X can take any value in a range and there exists an integrable function $f(x) \ge 0$ such that, for any $a \le b$:

$$P(a \le X \le b) = \int_a^b f(x)\,dx, \qquad \int_{-\infty}^{\infty} f(x)\,dx = 1$$

then X is called a continuous random variable, and $f(x)$ is called the probability density function of X.

For a continuous random variable, the probability of each single sample point is 0. Instead of an actual probability, the distribution of a continuous random variable is described by the probability density at each point. The value of $f(x)$ at a specific point represents the probability density at that point:

$$f(x) = \lim_{\Delta x \to 0} \frac{P(x \le X \le x + \Delta x)}{\Delta x}$$
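The difference between probability and density can be checked numerically. This sketch assumes a made-up density f(x) = 2x on [0, 1] and a simple midpoint-rule integrator (`integrate` is a helper written for this example, not a library function). It shows that an interval has positive probability, a single point has probability 0, and the probability of a tiny interval divided by its width approaches the density.

```python
def density(x):
    """Hypothetical density: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

total = integrate(density, 0.0, 1.0)       # whole range, close to 1
p_interval = integrate(density, 0.2, 0.5)  # P(0.2 <= X <= 0.5), close to 0.21
p_point = integrate(density, 0.5, 0.5)     # a single point has probability 0

# Probability of a small interval divided by its width approximates f(0.5) = 1.
dx = 1e-4
ratio = integrate(density, 0.5, 0.5 + dx) / dx

print(total, p_interval, p_point, ratio)
```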

5.3 Distribution Type

For details about distribution types, refer to here.

6. Expectation and Variance

6.1 Expectation

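For a discrete variable:

$$E[X] = \sum_{i} x_i\,p(x_i)$$

For a continuous variable:

$$E[X] = \int_{-\infty}^{\infty} x\,f(x)\,dx$$

In both cases $E[X]$ is defined only if the sum or integral converges.

Properties of Expectation (for constants $a$, $b$, $c$):

$$E[c] = c$$

$$E[aX + b] = a\,E[X] + b$$

$$E[X + Y] = E[X] + E[Y]$$

$$E[XY] = E[X]\,E[Y] \quad \text{if X and Y are independent}$$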
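A short sketch of the discrete expectation, computed as the probability-weighted sum of the values, for a fair die. It also checks the linearity property E[aX + b] = a E[X] + b with a = 2 and b = 1; the helper `expectation` and its optional transform argument `g` are illustrative.

```python
from fractions import Fraction

# Distribution of a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """Probability-weighted sum of g(x) over all values x."""
    return sum(g(x) * p for x, p in pmf.items())

e_x = expectation(pmf)                            # E[X]
e_linear = expectation(pmf, lambda x: 2 * x + 1)  # E[2X + 1]

print(e_x)       # 7/2
print(e_linear)  # 8, which equals 2 * E[X] + 1
```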

6.2 Variance

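Variance measures how far a random variable spreads around its expectation:

$$Var(X) = E\big[(X - E[X])^2\big] = E[X^2] - (E[X])^2$$

Properties of Variance (for constants $a$, $b$, $c$):

$$Var(c) = 0$$

$$Var(aX + b) = a^2\,Var(X)$$

$$Var(X + Y) = Var(X) + Var(Y) + 2\,Cov(X, Y)$$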
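The variance of a fair die can be computed with the standard definition Var(X) = E[(X - E[X])^2]. The sketch below also checks the shortcut E[X^2] - (E[X])^2 and the scaling property Var(aX + b) = a^2 Var(X); the helper names are illustrative.

```python
from fractions import Fraction

# Distribution of a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """Probability-weighted sum of g(x) over all values x."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf, g=lambda x: x):
    """Expected squared deviation of g(X) from its own mean."""
    mean = expectation(pmf, g)
    return expectation(pmf, lambda x: (g(x) - mean) ** 2)

var_x = variance(pmf)
var_linear = variance(pmf, lambda x: 2 * x + 1)
shortcut = expectation(pmf, lambda x: x * x) - expectation(pmf) ** 2

print(var_x)       # 35/12
print(var_linear)  # 35/3, which equals 4 * Var(X)
print(shortcut)    # 35/12
```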

6.3 Covariance

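Covariance is a measure of the joint variability of two random variables. It represents the degree to which the two variables vary in the same direction:

$$Cov(X, Y) = E\big[(X - E[X])(Y - E[Y])\big] = E[XY] - E[X]\,E[Y]$$

Properties of Covariance (for constants $a$, $b$):

$$Cov(X, X) = Var(X)$$

$$Cov(X, Y) = Cov(Y, X)$$

$$Cov(aX, bY) = ab\,Cov(X, Y)$$

$$Cov(X, Y) = 0 \quad \text{if X and Y are independent}$$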
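Covariance can be computed from a joint distribution with the standard identity Cov(X, Y) = E[XY] - E[X] E[Y]. The sketch below assumes two fair coin flips with X = first flip (0 or 1) and Y = total number of heads, an example pair chosen because the two variables clearly move together.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint distribution of X = first flip and Y = total heads over two fair flips.
joint = defaultdict(Fraction)
for flips in product((0, 1), repeat=2):
    joint[(flips[0], sum(flips))] += Fraction(1, 4)

def expect(g):
    """Expectation of g(X, Y) under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def covariance(g, h):
    """E[g * h] - E[g] * E[h] for two functions of (X, Y)."""
    return expect(lambda x, y: g(x, y) * h(x, y)) - expect(g) * expect(h)

def first_flip(x, y):
    return x

def total_heads(x, y):
    return y

cov_xy = covariance(first_flip, total_heads)
var_x = covariance(first_flip, first_flip)  # Cov(X, X) is Var(X)

print(cov_xy)  # 1/4, positive: more heads on the first flip means more heads in total
print(var_x)   # 1/4
```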


Author: Zhengyuan Yang
Posted on: December 2, 2022
Source: http://example.com/2022/12/02/Basic-prob/