[Big Data Analysis Engineer (빅데이터분석기사) Practical Exam] 8th Exam: Modified Past Questions (Type 3)

2024. 11. 15. 02:14


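8th Exam: Modified Past Questions (Type 3)

Introduction

This post presents modified versions of the Type 3 questions from the 8th Big Data Analysis Engineer practical exam.
The Type 3 section of the 8th exam covered advanced statistics, specifically regression analysis.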

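Reference Notes

In regression analysis, the null and alternative hypotheses for each variable are set up as follows.

Null hypothesis: the variable has no effect on the dependent variable. (Not significant.)
Alternative hypothesis: the variable has an effect on the dependent variable. (Significant.)
- Therefore, to find the variables that are not significant, select those for which the null hypothesis cannot be rejected (p-value > 0.05, the significance level).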
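
The decision rule above can be sketched in a few lines of plain Python. The p-values below are made-up illustrative numbers (not taken from any real fit), and the names `p_values` and `alpha` are assumptions for this example only.

```python
# Hypothetical p-values for three predictors (illustrative only).
p_values = {"x1": 0.003, "x2": 0.41, "x3": 0.07}
alpha = 0.05  # significance level

# Fail to reject the null hypothesis (p-value > alpha): not significant.
insignificant = [name for name, p in p_values.items() if p > alpha]
# Reject the null hypothesis (p-value <= alpha): significant.
significant = [name for name, p in p_values.items() if p <= alpha]

print(insignificant)  # ['x2', 'x3']
print(significant)    # ['x1']
```

With a statsmodels result the same comparison is applied to `result.pvalues` instead of a hand-written dictionary.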

Logistic regression uses statsmodels.api.Logit(y, X), and multiple linear regression uses statsmodels.api.OLS(y, X).
The p-values are available as result.pvalues, and the regression coefficients as result.params.
The odds ratio can be computed as np.exp(result.params['variable']).
- When the variable increases by N, the odds are multiplied by np.exp(N * result.params['variable']).
- result.params['variable'] is the regression coefficient of that variable.
In logistic regression, the residual deviance can be checked through a GLM (Generalized Linear Model).
- statsmodels.api.GLM(y, X, family=sm.families.Binomial())
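
For ungrouped 0/1 data, the saturated model has a log-likelihood of 0, so the residual deviance reduces to -2 times the model's log-likelihood. The following is a minimal pure-Python sketch of that relationship; the labels and fitted probabilities are hypothetical values chosen for illustration, and the helper names are not part of statsmodels.

```python
import math

def bernoulli_log_likelihood(y, p):
    """Sum of y*log(p) + (1-y)*log(1-p) over all observations."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

def residual_deviance(y, p):
    """Residual deviance for ungrouped binary data: -2 * log-likelihood."""
    return -2.0 * bernoulli_log_likelihood(y, p)

# Hypothetical labels and fitted probabilities (not from a real model).
y = [1, 0, 1, 1, 0]
p = [0.8, 0.3, 0.6, 0.9, 0.2]

llf = bernoulli_log_likelihood(y, p)
dev = residual_deviance(y, p)
print(llf, dev)
```

This agrees with the example output in this post, where the log-likelihood is -135.227... and the deviance is 270.454..., which is 2 × 135.227.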

Usage Example Code


			
			
			
		
import numpy as np
import pandas as pd
import statsmodels.api as sm
 
df = pd.read_csv('./datasets/dataset.csv')
 
# Check the correlation coefficients between variables
corr_df = df.corr(numeric_only=True)
print(corr_df)
 
""" Example output
              v1        v2        v3        v4        v5    Target
v1      1.000000 -0.066107 -0.036512  0.021895 -0.005389  0.872265
v2     -0.066107  1.000000 -0.124047 -0.061987 -0.114401 -0.383313
v3     -0.036512 -0.124047  1.000000 -0.115117  0.062719  0.310146
v4      0.021895 -0.061987 -0.115117  1.000000  0.033356 -0.011590
v5     -0.005389 -0.114401  0.062719  0.033356  1.000000  0.036221
Target  0.872265 -0.383313  0.310146 -0.011590  0.036221  1.000000
"""
 
##
## ===========================================================================
##
 
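X = df[['variable1', 'variable2', ...]]   # independent variables
X = sm.add_constant(X)                    # add the constant (intercept) term
 
y = df['target']                          # dependent variable ('target' is a placeholder column name)
 
# ✅ Logistic regression
model = sm.Logit(y, X).fit()
 
# Check the summary
summary = model.summary()
 
""" Example output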
                           Logit Regression Results                           
==============================================================================
Dep. Variable:                y_logit   No. Observations:                  100
Model:                          Logit   Df Residuals:                       96
Method:                           MLE   Df Model:                            3
Date:                Fri, 15 Nov 2024   Pseudo R-squ.:                 0.01884
Time:                        02:25:50   Log-Likelihood:                -66.748
converged:                       True   LL-Null:                       -68.029
Covariance Type:            nonrobust   LLR p-value:                    0.4640
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          1.5921      1.624      0.980      0.327      -1.591       4.775
X1            -0.0368      0.024     -1.540      0.124      -0.084       0.010
X2            -0.0020      0.015     -0.136      0.892      -0.031       0.027
X3          -4.16e-05      0.010     -0.004      0.997      -0.019       0.019
==============================================================================
"""
 
# ✅ Multiple linear regression
model = sm.OLS(y, X).fit()
 
# Check the summary
summary = model.summary()
 
""" Example output
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  y_ols   R-squared:                       0.989
Model:                            OLS   Adj. R-squared:                  0.988
Method:                 Least Squares   F-statistic:                     2795.
Date:                Fri, 15 Nov 2024   Prob (F-statistic):           2.99e-93
Time:                        02:25:50   Log-Likelihood:                -304.81
No. Observations:                 100   AIC:                             617.6
Df Residuals:                      96   BIC:                             628.0
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          4.2031      4.072      1.032      0.305      -3.880      12.286
X1             1.9530      0.059     32.994      0.000       1.836       2.071
X2             2.9656      0.037     80.338      0.000       2.892       3.039
X3            -0.9938      0.025    -40.439      0.000      -1.043      -0.945
==============================================================================
Omnibus:                        2.128   Durbin-Watson:                   1.690
Prob(Omnibus):                  0.345   Jarque-Bera (JB):                1.502
Skew:                          -0.040   Prob(JB):                        0.472
Kurtosis:                       2.405   Cond. No.                         836.
==============================================================================
"""
 
# Check the p-values
p_values = model.pvalues
 
""" Example output
const    3.045914e-01
X1       3.508077e-54
X2       7.613901e-90
X3       4.249762e-62
dtype: float64
"""
 
# Check the regression coefficients (coef)
coefficients = model.params
 
""" Example output
const    4.203092
X1       1.953025
X2       2.965557
X3      -0.993764
dtype: float64
"""
 
# Check the coefficient of determination (R-squared); available on OLS results
r_squared = model.rsquared
 
""" Example output
0.046868286704138895
"""
 
# Check the log-likelihood
log_likelihood = model.llf
 
""" Example output
-135.22741810901363
"""
 
# Compute the odds ratio (meaningful when model is a logistic regression fit)
odds_ratio = np.exp(model.params['variable'])
 
""" Example output
0.072313213120958013
"""
 
# Check the residual deviance
model = sm.GLM(y, X, family=sm.families.Binomial()).fit()   # ✅ GLM (Generalized Linear Model); the Binomial family corresponds to logistic regression
 
# Check the summary
summary = model.summary()
 
""" Example output
                 Generalized Linear Model Regression Results                  
==============================================================================
Dep. Variable:                 target   No. Observations:                  200
Model:                            GLM   Df Residuals:                      194
Model Family:                Binomial   Df Model:                            5
Link Function:                  Logit   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -135.23
Date:                Sat, 16 Nov 2024   Deviance:                       270.45
Time:                        20:07:33   Pearson chi2:                     200.
No. Iterations:                     4   Pseudo R-squ. (CS):           0.004967
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const         -0.9495      1.424     -0.667      0.505      -3.740       1.841
age            0.0011      0.010      0.107      0.915      -0.019       0.021
chol           0.0012      0.003      0.370      0.712      -0.005       0.008
trestbps       0.0029      0.006      0.470      0.638      -0.009       0.015
thalach       -0.0019      0.005     -0.376      0.707      -0.012       0.008
oldpeak        0.0549      0.102      0.537      0.591      -0.145       0.255
==============================================================================
"""
 
residual_deviance = model.deviance
 
""" Example output
270.45483621802725
"""
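
For reference, the `rsquared` value used in the example above corresponds to 1 - SS_res / SS_tot. The block below is a minimal pure-Python sketch for the single-predictor case, with made-up data; the function names are illustrative, and this is not how statsmodels computes the value internally.

```python
def simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(x, y):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    intercept, slope = simple_ols(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Made-up, nearly linear data (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(r_squared(x, y))
```

A value close to 1 means the fitted line explains most of the variance in y; a value close to 0, as in Problem 2 below, means it explains very little.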

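Problem 1

(1) Fit a logistic regression model and find the number of variables that are not significant

(2) Refit the logistic regression using only the significant variables as independent variables, and find the mean of the regression coefficients

(3) Compute the factor by which the odds change when the calls variable increases by 5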


			
			
			
		
import pandas as pd
import numpy as np
import statsmodels.api as sm
 
# Generate example data
np.random.seed(23123123)
data = pd.DataFrame({
    'age': np.random.randint(18, 60, 100),
    'salary': np.random.randint(30000, 120000, 100),
    'calls': np.random.randint(0, 50, 100),
    'churn': np.random.randint(0, 2, 100)  # customer churn indicator (dependent variable)
})
 
# (1) Fit a logistic regression model and count the variables that are not significant
 
# Select the independent variables and add the constant term
X = data[['age', 'salary', 'calls']]
X = sm.add_constant(X)
y = data['churn']
 
# Fit the logistic regression model
model = sm.Logit(y, X).fit()
 
summary = model.summary()
print(summary)
 
# Check the p-values
p_values = model.pvalues
 
# Variables that are not significant (p-value > 0.05)
# Note: the constant term is checked as well; it is significant here, so the count is unaffected
insignificant_vars = p_values[p_values > 0.05]
print(f"Number of insignificant variables: {len(insignificant_vars)}")
 
# (2) Refit the logistic regression with only the significant variables and find the mean coefficient
significant_vars = p_values[p_values <= 0.05].index  # significant variables (p-value <= significance level, constant term included)
 
# Check whether any significant variables exist
if len(significant_vars) > 0:
    X_significant = X[significant_vars]
    logit_model_significant = sm.Logit(y, X_significant).fit()
 
    # Mean of the regression coefficients
    coefficients = logit_model_significant.params
    mean_coefficient = coefficients.mean()
    print(f"Mean of the regression coefficients for the significant variables: {mean_coefficient}")
else:
    print("There are no significant variables.")
 
# (3) Factor by which the odds change when 'calls' increases by 5
if 'calls' in significant_vars:
    call_coeff = logit_model_significant.params['calls']
    odds_ratio_increase = np.exp(5 * call_coeff)
    print(f"If calls increases by 5, the odds are multiplied by {odds_ratio_increase}.")
else:
    print("'calls' is not significant or is not included in the model.")


			
			
			
		
Optimization terminated successfully.
         Current function value: 0.630233
         Iterations 5
                           Logit Regression Results                           
==============================================================================
Dep. Variable:                  churn   No. Observations:                  100
Model:                          Logit   Df Residuals:                       96
Method:                           MLE   Df Model:                            3
Date:                Fri, 15 Nov 2024   Pseudo R-squ.:                 0.09077
Time:                        01:46:47   Log-Likelihood:                -63.023
converged:                       True   LL-Null:                       -69.315
Covariance Type:            nonrobust   LLR p-value:                  0.005632
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          3.2608      1.060      3.075      0.002       1.182       5.339
age           -0.0345      0.019     -1.792      0.073      -0.072       0.003
salary     -1.611e-05   8.89e-06     -1.812      0.070   -3.35e-05    1.32e-06
calls         -0.0301      0.015     -2.026      0.043      -0.059      -0.001
==============================================================================
Number of insignificant variables: 2
Optimization terminated successfully.
         Current function value: 0.669250
         Iterations 4
Mean of the regression coefficients for the significant variables: 0.35029495471365896
If calls increases by 5, the odds are multiplied by 0.8589494047822301.

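(1) In the regression, the null hypothesis is "the variable has no effect on the dependent variable." If the null hypothesis cannot be rejected (p-value > significance level of 0.05), the variable is considered not significant. If it is rejected in favor of the alternative hypothesis (p-value < 0.05), the variable is considered significant. The variables that are not significant (p-value > 0.05) are age and salary, 2 in total.

(2) The mean of the regression coefficients is computed with the constant term included.

(3) It is computed as np.exp(5 * call_coeff). The result (about 0.859) is below 1, so the odds of churn decrease by roughly 14% when calls increases by 5.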
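
The reason for np.exp(5 * call_coeff) is that logistic regression is linear in the log-odds: raising a predictor by N adds N times its coefficient to the log-odds, so the odds are multiplied by exp(N × coefficient). The sketch below checks this numerically with the standard library only. The intercept and coefficient are hypothetical values, not the fitted ones above, and `odds` is an illustrative helper.

```python
import math

def odds(intercept, coef, x):
    """Odds p / (1 - p) implied by a one-predictor logistic model."""
    return math.exp(intercept + coef * x)

# Hypothetical coefficients (assumptions, not the fitted values above).
intercept, coef = 0.7, -0.03
n = 5  # size of the increase in the predictor

# Ratio of the odds after and before the increase, starting from x = 10.
ratio = odds(intercept, coef, 10 + n) / odds(intercept, coef, 10)
print(ratio, math.exp(n * coef))
```

The two printed numbers agree, and the ratio does not depend on the starting value of x. A negative coefficient gives a factor below 1, meaning the odds shrink.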

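Problem 2

(1) Fit a multiple linear regression model and find the regression coefficient of the most significant variable

(2) Find the coefficient of determination

(3) Predict house_price when size=200, location=5, age=10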


			
			
			
		
import pandas as pd
import statsmodels.api as sm
import numpy as np
 
# Generate example data
np.random.seed(123123)
data = pd.DataFrame({
    'size': np.random.randint(50, 300, 100),  # house size (in pyeong)
    'location': np.random.randint(1, 10, 100),  # location (1-9; the upper bound is exclusive)
    'age': np.random.randint(1, 50, 100),  # age (years)
    'house_price': np.random.randint(100000, 1000000, 100)  # house price
})
 
# (1) Fit a multiple linear regression model and find the coefficient of the most significant variable
x = data[['size', 'location', 'age']]
y = data['house_price']
 
# Add the constant term
x = sm.add_constant(x)
 
# Fit the multiple linear regression model
model = sm.OLS(y, x).fit()
 
# Print the regression results
summary = model.summary()
print("Regression results summary:")
print(summary)
 
# (2) Coefficient of determination
r_squared = model.rsquared
print(f"Coefficient of determination (R-squared): {r_squared}")
 
# (3) Predict house_price for size=200, location=5, age=10
size = 200
location = 5
age = 10
 
predicted_price = model.predict([1, size, location, age])  # include the constant term (1) in the input
print(f"Predicted house_price: {predicted_price[0]}")


			
			
			
		
Regression results summary:
                            OLS Regression Results                            
==============================================================================
Dep. Variable:            house_price   R-squared:                       0.047
Model:                            OLS   Adj. R-squared:                  0.017
Method:                 Least Squares   F-statistic:                     1.574
Date:                Fri, 15 Nov 2024   Prob (F-statistic):              0.201
Time:                        01:56:00   Log-Likelihood:                -1377.9
No. Observations:                 100   AIC:                             2764.
Df Residuals:                      96   BIC:                             2774.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const       7.002e+05   8.77e+04      7.988      0.000    5.26e+05    8.74e+05
size        -676.8792    319.354     -2.120      0.037   -1310.792     -42.967
location   -3044.8104   8910.179     -0.342      0.733   -2.07e+04    1.46e+04
age          216.6799   1701.382      0.127      0.899   -3160.536    3593.895
==============================================================================
Omnibus:                        9.060   Durbin-Watson:                   2.075
Prob(Omnibus):                  0.011   Jarque-Bera (JB):                3.390
Skew:                           0.038   Prob(JB):                        0.184
Kurtosis:                       2.101   Cond. No.                         703.
==============================================================================
 
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Coefficient of determination (R-squared): 0.046868286704138895
Predicted house_price: 551755.6794774851

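(1) Only size has a p-value (0.037) below the significance level (0.05), so the alternative hypothesis (the variable affects the dependent variable) is adopted for it. The most significant variable is therefore size, and its regression coefficient is -676.8792.

(2) The coefficient of determination can be obtained with model.rsquared. (Reading the R-squared entry in the summary() output also works.)

(3) The prediction is made with model.predict(). The value for the constant term (1) must be included in the input.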
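
As a sanity check on (3), the prediction can be reproduced by hand as the dot product of the coefficients and the input row. The sketch below uses the coefficients as printed (rounded) in the OLS summary above; the dictionary layout is just one illustrative way to write it.

```python
# Coefficients as printed (rounded) in the OLS summary above.
coef = {"const": 7.002e05, "size": -676.8792, "location": -3044.8104, "age": 216.6799}
# Input row; the value for the constant term is always 1.
row = {"const": 1, "size": 200, "location": 5, "age": 10}

# Linear prediction: sum of coefficient * value over all terms.
predicted = sum(coef[name] * row[name] for name in coef)
print(predicted)
```

This gives roughly 551,767, slightly off from the 551,755.68 returned by model.predict() because the summary rounds the constant term to four significant digits.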


License: Attribution, NonCommercial, NoDerivatives (CC BY-NC-ND)

