What's the difference between a probability density function and a probability distribution function?

**Answer**

**Distribution Function**

- The terms "probability distribution function" and "probability function" are ambiguous. According to Wikipedia, they may refer to any of:
  - the probability density function (PDF),
  - the cumulative distribution function (CDF), or
  - the probability mass function (PMF).

- What is unambiguous is which function applies in which case:
  - Discrete case: probability mass function (PMF)
  - Continuous case: probability density function (PDF)
  - Both cases: cumulative distribution function (CDF)

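- The probability at a particular value x, P(X = x), can be read directly from:
  - the PMF in the discrete case.
  - In the continuous case the PDF gives a density f(x), not a probability: P(X = x) is 0 for every single x, and probabilities are obtained by integrating the PDF over an interval.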

- The probability of values up to x, P(X ≤ x), or of values within a range from a to b, P(a < X ≤ b) = F(b) − F(a), can be read directly from:
  - the CDF F, in both the discrete and the continuous case.

- "Distribution function" on its own usually refers to the CDF, also called the cumulative frequency function (see this).
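As a minimal sketch of the lookups listed above, here is a PMF, a PDF and a CDF side by side. The fair six-sided die and the standard normal distribution are assumptions chosen for illustration, not part of the question. Only Python's standard library is used.

```python
from fractions import Fraction
from statistics import NormalDist

# Discrete case (assumed example): a fair six-sided die.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def cdf_discrete(x):
    """P(X <= x): sum the PMF over all faces not exceeding x."""
    return sum(p for face, p in pmf.items() if face <= x)


p_equal_3 = pmf[3]                             # P(X = 3), read from the PMF
p_at_most_3 = cdf_discrete(3)                  # P(X <= 3), read from the CDF
p_2_to_4 = cdf_discrete(4) - cdf_discrete(1)   # P(2 <= X <= 4) = F(4) - F(1)

# Continuous case (assumed example): a standard normal variable.
z = NormalDist(mu=0.0, sigma=1.0)
density_at_0 = z.pdf(0.0)                # a density, not a probability
p_between = z.cdf(1.0) - z.cdf(-1.0)     # P(-1 < Z <= 1) from the CDF

print(p_equal_3, p_at_most_3, p_2_to_4)
print(density_at_0, p_between)
```

Note that `density_at_0` is a height of the curve, so it only becomes a probability once it is integrated over an interval, which is what the difference of two CDF values does.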

**In terms of Acquisition and Plot Generation Method**

- Collected data appear discrete when:
  - The quantity measured is naturally discrete, such as the result of a dice roll or a count of people.
  - The measurement is digitized machine data, which has no intermediate values between quantization levels because of the sampling process.
  - In the latter case, the higher the resolution, the closer the measurement is to an analog (continuous) signal or variable.

- How to generate a PMF from discrete data:
  - Plot a histogram of the data over all x values; the y-axis is the frequency (count) at each x.
  - Scale the y-axis by dividing by the total number of data points (the sample size). The result is the (empirical) PMF.

- How to generate a PDF from discrete or continuous data:
  - Choose a continuous equation that models the collected data, say the normal distribution.
  - Estimate the parameters of that equation from the collected data. For the normal distribution, these are the mean and the standard deviation.
  - Using those parameters, plot the equation over continuous x values. The result is the PDF.

- How to generate a CDF:
  - In the discrete case, the CDF at x is the sum of the PMF values at x and at every smaller value. Repeating this for every x gives a monotonically non-decreasing step plot that reaches 1 at the largest x. This is the discrete CDF.
  - In the continuous case, integrate the PDF from −∞ up to x; the result is a continuous CDF.
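The steps above can be sketched with Python's standard library. The die rolls and heights below are made-up data for illustration, the normal model is just the example family mentioned above, and `cdf_by_integration` is a hypothetical helper that integrates the fitted PDF with the trapezoidal rule, starting ten standard deviations below the mean as a stand-in for −∞.

```python
from collections import Counter
from itertools import accumulate
from statistics import NormalDist, mean, stdev

# Hypothetical discrete data: ten die rolls.
rolls = [1, 3, 3, 6, 2, 3, 5, 6, 1, 4]

# PMF: histogram counts divided by the sample size.
counts = Counter(rolls)
pmf = {x: counts[x] / len(rolls) for x in sorted(counts)}

# Discrete CDF: running sum of the PMF over increasing x.
cdf = dict(zip(pmf, accumulate(pmf.values())))

# Hypothetical continuous data: heights in cm.
heights_cm = [158.0, 162.5, 165.0, 167.5, 170.0,
              171.0, 173.5, 176.0, 180.0, 184.5]

# PDF: estimate mean and standard deviation, then evaluate the normal curve.
fitted = NormalDist(mean(heights_cm), stdev(heights_cm))
pdf_curve = [(x, fitted.pdf(x)) for x in range(150, 191)]


def cdf_by_integration(dist, x, steps=20_000):
    """Approximate P(X <= x) by trapezoidal integration of dist.pdf."""
    lower = dist.mean - 10 * dist.stdev  # tail below this is negligible
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    total = 0.5 * (dist.pdf(lower) + dist.pdf(x))
    total += sum(dist.pdf(lower + i * h) for i in range(1, steps))
    return total * h


print(pmf)
print(cdf)
print(cdf_by_integration(fitted, 170.0), fitted.cdf(170.0))
```

The last line prints the numerically integrated value next to the closed-form `NormalDist.cdf`, which should agree closely if the integration is set up correctly.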

**Why PMF, PDF and CDF?**

- The PMF is preferred when
  - the probability at each individual x value is what we want to study. This makes sense for discrete data, for example when we are interested in the probability of getting a certain number from a dice roll.

- The PDF is preferred when
  - we wish to model the collected data with a continuous function, using a few parameters such as the mean to infer the population distribution.

- The CDF is preferred when
  - the cumulative probability over a range is what we are interested in.
  - Especially for continuous data, the CDF is often more meaningful than the PDF. For example, the probability that a student's height is less than 170 cm (CDF) is much more informative than the density at exactly 170 cm (PDF).
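To make the height example concrete, here is a small sketch. The mean of 168 cm and standard deviation of 7 cm are assumed values picked for illustration only. It shows that the CDF answers "less than 170 cm" directly, while the probability of landing in an ever narrower window around 170 cm shrinks toward zero.

```python
from statistics import NormalDist

# Assumed population model for student heights (illustrative values only).
heights = NormalDist(mu=168.0, sigma=7.0)

# CDF: probability that a student is shorter than 170 cm.
p_below_170 = heights.cdf(170.0)

# PDF: the density at exactly 170 cm (per cm, not a probability).
density_at_170 = heights.pdf(170.0)

# Probability of falling within 170 +/- half_width, for shrinking windows.
half_widths = [1.0, 0.1, 0.001]
window_probs = [
    heights.cdf(170.0 + w) - heights.cdf(170.0 - w) for w in half_widths
]

print(p_below_170)
print(density_at_170)
print(window_probs)
```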

**Attribution**

*Source: Link, Question Author: Le Chifre, Answer Author: Rócherz*