#### 1. Regression Analysis

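Regression analysis is a statistical process for estimating the relationships between a dependent variable ($y$) and one or more independent variables ($x$). That is, estimating the parameters $\theta$ (also called weights) in

$$h_\theta(x) = \sum_{j=0}^{n} \theta_j x_j = \theta^T x, \quad x_0 = 1 \tag{1}$$

based on some training examples $(x^{(i)}, y^{(i)})$, $i = 1, \dots, m$.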

Regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning.

Regression refers specifically to the estimation of continuous response variables, as opposed to the discrete response variables used in classification.

In linear regression, the model specification is that the dependent variable $y$ is a linear combination of the parameters (but need not be linear in the independent variables).
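As a small illustration of "linear in the parameters but not in the independent variables", the sketch below evaluates a quadratic in $x$ as a plain matrix-vector product. The helper names `design_matrix` and `predict`, and the weight values, are made up for this example.

```python
import numpy as np

def design_matrix(x, degree):
    """Stack the columns 1, x, x**2, ..., x**degree (hypothetical helper)."""
    return np.vander(x, degree + 1, increasing=True)

def predict(theta, X):
    """Hypothesis that is linear in theta: h_theta(x) = X @ theta."""
    return X @ theta

x = np.array([0.0, 1.0, 2.0, 3.0])
theta = np.array([1.0, -2.0, 0.5])   # assumed weights for 1, x and x**2
X = design_matrix(x, degree=2)
y_hat = predict(theta, X)            # quadratic in x, linear in theta
```

The model is a curve in $x$, yet fitting it is still a linear regression problem because `theta` only enters through the product `X @ theta`.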

#### 2. Least Mean Squares Algorithm

There are several methods for the regression problem:

1. Least mean squares.

2. Bayesian methods, e.g. Bayesian linear regression.

3. Percentage regression, for situations where reducing percentage errors is deemed more appropriate.

4. Least absolute deviations, which is more robust in the presence of outliers and leads to quantile regression.

5. Nonparametric regression, which requires a large number of observations and is computationally intensive.

6. Distance metric learning, which searches for a meaningful distance metric in a given input space.

If the error between the true value $y$ and the estimate $h_\theta(x)$ follows a Gaussian distribution, the least mean squares algorithm gives the maximum likelihood estimate of the parameters.
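To see why, here is a short derivation, assuming the errors $\epsilon^{(i)} = y^{(i)} - h_\theta(x^{(i)})$ are independent and Gaussian with zero mean and variance $\sigma^2$. The symbols $\epsilon$, $\sigma$, $m$ and $\ell$ are introduced here for illustration, and $h_\theta(x)$ denotes the model's estimate. The log-likelihood of $m$ training examples is

$$\ell(\theta) = \sum_{i=1}^{m} \log \left[ \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\left(y^{(i)} - h_\theta(x^{(i)})\right)^2}{2\sigma^2}\right) \right] = m \log \frac{1}{\sqrt{2\pi}\,\sigma} - \frac{1}{2\sigma^2} \sum_{i=1}^{m} \left(y^{(i)} - h_\theta(x^{(i)})\right)^2$$

The first term does not depend on $\theta$, so maximizing $\ell(\theta)$ is the same as minimizing the sum of squared errors.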

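We define the cost function

$$J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 \tag{2}$$

Our aim is to minimize $J(\theta)$ by a proper choice of $\theta$.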
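A minimal sketch of such a cost function in NumPy, assuming the usual half sum of squared errors and a model that is linear in the parameters. The function name `cost` and the toy data are chosen for this example only.

```python
import numpy as np

def cost(theta, X, y):
    """Half the sum of squared errors between X @ theta and y."""
    residual = X @ theta - y
    return 0.5 * np.sum(residual ** 2)

# Toy data generated by y = 1 + 2 * x; the first column of X is the constant 1.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

cost_at_truth = cost(np.array([1.0, 2.0]), X, y)   # exact parameters
cost_offset = cost(np.array([0.0, 2.0]), X, y)     # intercept off by one
```

With the exact parameters every residual is zero, so the cost is zero. Shifting the intercept by one gives three residuals of size one and a cost of 1.5.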

#### 3. Gradient Descent Algorithm

There are also several ways to find the $\theta$ that minimizes $J(\theta)$:

1. Batch gradient descent.

2. Stochastic gradient descent (or incremental gradient descent).

3. Newton's method (or Newton-Raphson method).

4. Solve directly.
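For the last option, when the model is linear in the parameters, the minimizer of the squared error can be computed in one step. The sketch below uses `np.linalg.lstsq` on made-up data; it is one possible way to solve directly, not necessarily the approach intended here.

```python
import numpy as np

# Toy training set: a column of ones (intercept) and one feature.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.1, 4.9, 7.0])

# Least-squares solution of X @ theta ~= y.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)

# At the minimum the gradient X.T @ (X @ theta - y) vanishes.
gradient_at_solution = X.T @ (X @ theta - y)
```

For this data the fitted intercept is about 1.03 and the slope about 1.98.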

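Batch gradient descent starts with some initial $\theta$ and repeatedly performs the update

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta) \tag{3}$$

(This update is performed simultaneously for all values of $j = 0, \dots, n$, where $n$ is the number of independent variables $x$, that is, the number of features.)

Here, $\alpha$ is called the learning rate.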
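A minimal sketch of batch gradient descent for a model that is linear in the parameters, assuming the cost is half the sum of squared errors so that its gradient is `X.T @ (X @ theta - y)`. The function name, the learning rate, and the iteration count are illustrative choices.

```python
import numpy as np

def batch_gradient_descent(X, y, alpha=0.05, n_iters=2000):
    """Minimize 0.5 * sum((X @ theta - y)**2) with a fixed learning rate."""
    theta = np.zeros(X.shape[1])            # initial guess
    for _ in range(n_iters):
        gradient = X.T @ (X @ theta - y)    # uses every training example
        theta = theta - alpha * gradient    # all theta_j updated at once
    return theta

# Noise-free toy data from y = 1 + 2 * x on five points in [0, 1].
X = np.column_stack([np.ones(5), np.linspace(0.0, 1.0, 5)])
y = 1.0 + 2.0 * X[:, 1]
theta = batch_gradient_descent(X, y)
```

On this data the iteration converges to an intercept of 1 and a slope of 2. A learning rate that is too large makes the updates diverge, so `alpha` usually has to be tuned to the scale of the data.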

#### 4. Fit

Python modules for regression:

1. StatsModels. A Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests and statistical data exploration.

2. Scikit-learn. A machine learning module that can be used for classification, regression, clustering, dimensionality reduction, model selection, and preprocessing.
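For comparison with a hand-written fit, here is a minimal sketch of an ordinary least-squares fit with scikit-learn's `LinearRegression`, on made-up noise-free data. It assumes scikit-learn is installed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data from y = 1 + 2 * x.
x = np.linspace(0.0, 1.0, 20)
y = 1.0 + 2.0 * x

# scikit-learn expects a 2-D feature array of shape (n_samples, n_features).
model = LinearRegression().fit(x.reshape(-1, 1), y)
intercept, slope = model.intercept_, model.coef_[0]
```

The intercept is fitted by default, so no column of ones is needed in the feature array.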

In this case, we try to find the parameters $\theta$ of the model for a given set of training examples. Clearly, this regression problem is not linear.

Our program has the following structure:

1. Construct a set of noise-free samples $(x, y)$ from the model.

2. Add Gaussian noise to $y$ to get the training samples.

3. Use batch gradient descent to find the $\theta$ in the model. The update rule is

(4)

4. Plot the original data, the training samples, and the fitted curve.

After 180 iterations, the change per iteration falls below the convergence threshold. In most cases, the estimated $\theta$ differs from the true $\theta$, while the two curves have almost the same shape, no matter how many training samples we use.
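The program structure listed above can be sketched as follows. The model used here, $h_\theta(x) = \theta_0 e^{-\theta_1 x}$, is a hypothetical stand-in chosen only because it is non-linear in $\theta$; the true parameters, noise level, learning rate, and tolerance are likewise assumptions, so iteration counts from this sketch will not match those reported above.

```python
import numpy as np

def model(theta, x):
    # Hypothetical non-linear model, NOT the one used in the text.
    return theta[0] * np.exp(-theta[1] * x)

def cost(theta, x, y):
    return 0.5 * np.sum((model(theta, x) - y) ** 2)

def gradient(theta, x, y):
    decay = np.exp(-theta[1] * x)
    residual = theta[0] * decay - y
    d_theta0 = np.sum(residual * decay)                     # dJ/dtheta_0
    d_theta1 = np.sum(residual * (-theta[0] * x * decay))   # dJ/dtheta_1
    return np.array([d_theta0, d_theta1])

# Step 1: noise-free samples from the assumed model.
rng = np.random.default_rng(0)
theta_true = np.array([2.0, 1.0])
x = np.linspace(0.0, 2.0, 50)
y_clean = model(theta_true, x)

# Step 2: add Gaussian noise to get the training samples.
y_train = y_clean + rng.normal(0.0, 0.05, size=x.shape)

# Step 3: batch gradient descent until the cost stops changing.
theta = np.array([1.0, 0.5])   # initial guess
alpha, tol = 0.01, 1e-10       # assumed learning rate and tolerance
initial_cost = previous_cost = cost(theta, x, y_train)
for iteration in range(5000):
    theta = theta - alpha * gradient(theta, x, y_train)
    current_cost = cost(theta, x, y_train)
    if abs(previous_cost - current_cost) < tol:
        break
    previous_cost = current_cost
```

Plotting `y_clean`, `y_train`, and `model(theta, x)` against `x`, for example with Matplotlib, covers the last step. Because of the noise, the fitted `theta` is close to `theta_true` but not identical.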