Regression Analysis
Regression analysis is a statistical method for determining whether, and how strongly, one variable is influenced by one or more others. A useful property of regression is that several variables can influence the variable of interest at once. Regression analysis can also be used for prediction.
To get started with regression analysis, you need to understand two types of variables:
Dependent variable: the variable that you want to examine, understand, or predict.
Independent variable(s): all the other variables that you hypothesize to influence the dependent variable.
To start a regression analysis, first choose the dependent variable. Then choose the independent variable or variables that you hypothesize affect it.
The next step is obtaining data for the regression analysis, usually a dataset that contains the chosen dependent and independent variables. For instance, if each variable is stored in a separate dataset, the variables of interest can be extracted and combined into a new dataset.
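As a minimal sketch of that combining step, the snippet below joins two separate datasets on a shared key. The datasets, the names `ad_spend` and `sales`, and the helper `combine` are hypothetical and only serve the illustration; real data would more likely come from files or a database.

```python
# Hypothetical data: two separate datasets keyed by month.
ad_spend = {"Jan": 10, "Feb": 12, "Mar": 15, "Apr": 11}   # independent variable
sales = {"Jan": 130, "Feb": 142, "Mar": 160, "May": 150}  # dependent variable


def combine(independent, dependent):
    """Return (key, x, y) rows for the keys present in both datasets."""
    return [
        (key, independent[key], dependent[key])
        for key in independent          # keeps the order of the first dataset
        if key in dependent             # drops keys missing from the second
    ]


rows = combine(ad_spend, sales)
for key, x, y in rows:
    print(key, x, y)
```

Only observations that have both an x value and a y value are kept, since a regression needs complete pairs.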

After that, the data should be plotted. By convention, the independent variable goes on the x-axis and the dependent variable on the y-axis.
From the plot, you can observe initial trends and correlation that suggest what kind of relationship the dependent and independent variables have. In the example plot, the hypothetical data points show an increasing trend: as the independent variable increases, the dependent variable increases as well.
A trend can be observed from the plot, but to what precise degree is the dependent variable influenced by the independent one? To answer that, a regression line should be calculated. This is usually done in software such as STATA or Excel. The regression line is the best approximation of the data points on the plot.
In other words, explains Redman, “The red line is the best explanation of the relationship between the independent variable and dependent variable.”

Calculating the regression line
Calculating a regression line means finding the best-fit line through all the data points. For simple linear regression, the least-squares method is usually used.
The linear regression line is a straight line of the form y=mx+b. To find the best-fit line for your data, you first need to find five summary statistics:
1. The mean of the x values
2. The mean of the y values
3. The standard deviation of the x values (denoted sx)
4. The standard deviation of the y values (denoted sy)
5. The correlation between X and Y (denoted r)
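The five summary statistics can be sketched in Python with only the standard library. The function name `summary_statistics` and the sample data are assumptions made for illustration, and the sample (n-1) standard deviation is used here; the article does not say which variant it intends.

```python
import statistics


def summary_statistics(xs, ys):
    """Return (mean_x, mean_y, s_x, s_y, r) for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")

    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    s_x = statistics.stdev(xs)  # sample standard deviation of x
    s_y = statistics.stdev(ys)  # sample standard deviation of y

    # Pearson correlation: sample covariance divided by both deviations.
    n = len(xs)
    covariance = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)
    ) / (n - 1)
    r = covariance / (s_x * s_y)
    return mean_x, mean_y, s_x, s_y, r


# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
mean_x, mean_y, s_x, s_y, r = summary_statistics(xs, ys)
print(f"mean_x={mean_x}, mean_y={mean_y}, s_x={s_x:.3f}, s_y={s_y:.3f}, r={r:.3f}")
```

The correlation r comes out the same whether sample or population standard deviations are used, as long as the covariance uses the same divisor.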
The formula for calculating the slope m of the regression line is the following:
$$m = r \cdot \frac{s_y}{s_x}$$
This formula gives the slope of the regression line equation y=mx+b. The last part to calculate is the y-intercept b, which can be calculated using the formula below:
$$b = \bar{y} - m\bar{x}$$
Here $\bar{x}$ and $\bar{y}$ are the means of the x values and y values, respectively, and m is the already calculated slope.
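To make the two formulas concrete, here is a small sketch that turns the five summary statistics into a slope and an intercept. The function names `line_from_summary` and `predict` and the input values are hypothetical; the values were chosen so the arithmetic is easy to follow by hand.

```python
def line_from_summary(mean_x, mean_y, s_x, s_y, r):
    """Return (m, b) for the least-squares line y = m*x + b."""
    if s_x == 0:
        raise ValueError("x values have no spread; the slope is undefined")
    m = r * (s_y / s_x)       # slope: correlation scaled by the ratio of spreads
    b = mean_y - m * mean_x   # intercept: the line passes through (mean_x, mean_y)
    return m, b


def predict(x, m, b):
    """Evaluate the fitted line at x."""
    return m * x + b


# Hypothetical summary statistics.
m, b = line_from_summary(mean_x=10, mean_y=130, s_x=2, s_y=15, r=0.8)
print(f"y = {m:.1f}x + {b:.1f}")
print(f"prediction at x=12: {predict(12, m, b):.1f}")
```

With these inputs the slope is 0.8 × 15 / 2 = 6 and the intercept is 130 − 6 × 10 = 70.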
For example, the regression line that Excel produces will look something like y=6x+70+error_term. It differs from the simple regression line we calculated in that it includes an error_term.
Regression lines always include an error term because, in reality, independent variables are never perfect predictors of dependent variables.
The dependent variable might be determined by a number of different factors. The regression line is only an estimate based on the data available to you, and the larger the error term, the less certain your regression line is.
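One common way to gauge the size of the error term is to look at the residuals, the differences between the observed y values and the values the line predicts. The sketch below computes them together with the residual standard error; the data, the coefficients, and the function names are hypothetical, and the coefficients are assumed to be the least-squares fit for this particular data.

```python
import math


def residuals(xs, ys, m, b):
    """Observed y minus the y predicted by the line, for each point."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]


def residual_standard_error(xs, ys, m, b):
    """Typical size of a residual; n - 2 because m and b were estimated."""
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than two points")
    sse = sum(e ** 2 for e in residuals(xs, ys, m, b))
    return math.sqrt(sse / (n - 2))


# Hypothetical data and its fitted line y = 0.6x + 2.2.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b = 0.6, 2.2

for x, e in zip(xs, residuals(xs, ys, m, b)):
    print(f"x={x}: residual={e:+.2f}")
print(f"residual standard error: {residual_standard_error(xs, ys, m, b):.3f}")
```

A smaller residual standard error means the points sit closer to the line, so predictions from the line deserve more confidence.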
Conclusion
Regression analysis helps determine the effect of some variables on another. It is widely used in business analysis to identify the factors that influence a target variable and to predict its future values.
We’ve discussed what regression analysis is and how to calculate the regression line.
Translated from: https://medium.com/swlh/regression-analysis-86e6a8bee0b7