Using Probabilistic Machine Learning to Improve Your Stock Trading

Note from Towards Data Science’s editors: While we allow independent authors to publish articles in accordance with our rules and guidelines, we do not endorse each author’s contribution. You should not rely on an author’s works without seeking professional advice. See our Reader Terms for details.


Probabilistic Machine Learning goes hand in hand with Stock Trading: Probabilistic Machine Learning uses past instances to estimate the probability of certain events happening in future instances. This can be applied directly to stock trading, to predict the direction of future stock prices.

The Concept:

This program will use Gaussian Naive Bayes to classify data into increasing stock price, or decreasing stock price.

Because of the volatility of stocks, I will not use the raw closing price as the input to the prediction, but rather the ratio between each closing price and the one before it. To understand how the program works, we must first understand the underlying algorithm at play:
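
To make the ratio idea concrete, here is a minimal sketch. The prices are made up for illustration, and `close_ratios` is a hypothetical helper that does not appear in the program itself.

```python
import numpy as np

def close_ratios(closes):
    """Return each closing price divided by the previous closing price."""
    closes = np.asarray(closes, dtype=float)
    return closes[1:] / closes[:-1]

# Made-up closing prices, for illustration only
example_closes = [100.0, 102.0, 101.0, 101.0]
example_ratios = close_ratios(example_closes)
print(example_ratios)
```

A ratio above 1 means the price rose that day, and a ratio below 1 means it fell. Because ratios do not depend on the price level, a window taken from 2016 is comparable with a window taken from 2019.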

What is a Gaussian Naive Bayes Classifier?

Gaussian Naive Bayes is an algorithm that classifies data by modelling each feature within each class as a Gaussian distribution (another name for the Normal distribution) and combining the resulting likelihoods with Bayes' theorem.
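
Written out, the decision rule looks as follows. The symbols are notation introduced here, not taken from the article: $x_i$ is the $i$-th feature (in this program, the $i$-th price ratio), $c$ is a class (increase or drop), $P(c)$ is the prior probability of that class, and $\mu_{c,i}$ and $\sigma_{c,i}$ are the mean and standard deviation of feature $i$ among the training examples of class $c$.

```latex
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n}
  \frac{1}{\sqrt{2\pi\sigma_{c,i}^{2}}}
  \exp\!\left(-\frac{(x_i - \mu_{c,i})^{2}}{2\sigma_{c,i}^{2}}\right)
```

The code in Step 3 multiplies only the Gaussian terms and leaves out $P(c)$, which amounts to treating both classes as equally likely beforehand.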

Advantages:

  • Works on small datasets

Unlike a traditional fully connected neural network, in which each neuron is connected to every neuron in the adjacent layers, Naive Bayes assumes that the features are independent of one another given the class. It therefore only has to estimate a mean and a standard deviation per feature and per class, which a small dataset can support.

  • Not computationally intensive

The parameters of the Naive Bayes Classifier are computed directly from the data in a single pass, so they do not change every iteration, unlike the weights that power a Neural Network. This makes the algorithm much less computationally intensive.

Disadvantages:

  • Fails at learning Big Data

When there is enough data to optimize all of its parameters, the complex mapping of a Neural Network outperforms the simple architecture of the Naive Bayes algorithm.

The Code:

With a better understanding of how the Gaussian Naive Bayes algorithm works, let’s get to the program:

Step 1| Prerequisites:

    import numpy as np
    import yfinance
    from scipy import stats

    aapl = yfinance.download('AAPL', '2016-1-1', '2020-1-1')

These are the libraries that I will use for the project: yfinance is for downloading stock data and scipy is for creating Gaussian distributions. numpy is imported as well, because the later steps call np.std, np.mean and np.prod.

I downloaded Apple stock data, from 2016 to 2020, for reproducible results.


Step 2| Converting to Gaussian Distributions:


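    def calculate_prereq(values):
        # standard deviation and mean of one feature's samples
        std = np.std(values)
        mean = np.mean(values)
        return std, mean

    def calculate_distribution(mean, std):
        # frozen normal distribution with the given parameters
        norm = stats.norm(mean, std)
        return norm

    def extrapolate(norm, x):
        # probability density of x under the fitted distribution
        return norm.pdf(x)

    def values_to_norm(dicts):
        # replace each feature's samples with a fitted distribution, in place
        for dictionary in dicts:
            for term in dictionary:
                std, mean = calculate_prereq(dictionary[term])
                norm = calculate_distribution(mean, std)
                dictionary[term] = norm
        return dicts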

The “calculate_prereq” function calculates the standard deviation and the mean: the two things needed to create a Gaussian distribution.

I could have written the function that creates a Gaussian distribution from scratch, but scipy’s functions have been highly optimized and would therefore work better on datasets with more features.
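
For comparison, a from-scratch version is short. The sketch below is illustrative only: `gaussian_pdf` is a hypothetical name, and the mean and standard deviation are made-up values chosen to resemble daily price ratios.

```python
import math
from scipy import stats

def gaussian_pdf(x, mean, std):
    """Normal density computed directly from the formula."""
    coefficient = 1.0 / (std * math.sqrt(2.0 * math.pi))
    exponent = -((x - mean) ** 2) / (2.0 * std ** 2)
    return coefficient * math.exp(exponent)

# Made-up parameters, for illustration only
from_scratch = gaussian_pdf(1.01, 1.0, 0.02)
from_scipy = stats.norm(1.0, 0.02).pdf(1.01)
print(from_scratch, from_scipy)
```

The two values agree, so the choice between them is about convenience and speed rather than about the result.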

Gaussian distributions are approximations of general probabilistic data. Take the example of the IQ test spectrum. IQ scores are scaled so that the average score is 100. Therefore, the peak of the Gaussian distribution is at 100. On both ends of the spectrum, the number of people getting extremely low or extremely high scores decreases as the scores become more extreme. With a Gaussian distribution, one can estimate how likely a person is to get a certain value and therefore gain insight into it.
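
The IQ example can be sketched with scipy. The mean of 100 comes from the text above. The standard deviation of 15 is an assumption (it is the conventional IQ scaling, but the article does not state it).

```python
from scipy import stats

# Mean of 100 is from the text; the standard deviation of 15 is assumed
iq = stats.norm(100, 15)

density_at_mean = iq.pdf(100)   # height of the curve at its peak
density_at_130 = iq.pdf(130)    # height two standard deviations above the mean
share_above_130 = iq.sf(130)    # fraction of the population scoring above 130

print(density_at_mean, density_at_130, share_above_130)
```

Note that `pdf` returns a density, not a probability. The probability of a range of scores comes from the cumulative functions (`cdf` and `sf`). For picking the most likely class, comparing densities is enough, which is what the program does.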

Step 3| Compare Possibilities:

    def compare_possibilities(dicts, x):
        probabilities = []
        for dictionary in dicts:
            dict_probs = []
            for i in range(len(x)):
                value = x[i]
                dict_probs.append(extrapolate(dictionary[i], value))
            probabilities.append(np.prod(dict_probs))
        return probabilities.index(max(probabilities))

This function runs through the dictionaries (one per class) and, for each class, multiplies together the probability densities of the nine ratios between the closing prices of the last ten days. It then returns the index, in the list of dictionaries, of the class that the Bayes Classifier calculates to have the highest probability.
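
Multiplying many small densities can underflow to zero once there are many features. A common variant, sketched here as an alternative and not as the article's method, sums log densities instead. `compare_log_possibilities` is a hypothetical name, and the toy distributions use made-up parameters.

```python
import numpy as np
from scipy import stats

def compare_log_possibilities(dicts, x):
    """Return the index of the class with the highest summed log density."""
    scores = []
    for dictionary in dicts:
        # logpdf avoids multiplying many tiny numbers together
        scores.append(sum(dictionary[i].logpdf(value) for i, value in enumerate(x)))
    return int(np.argmax(scores))

# Toy classes with two features each (made-up parameters)
toy_increase = {0: stats.norm(1.01, 0.01), 1: stats.norm(1.01, 0.01)}
toy_drop = {0: stats.norm(0.99, 0.01), 1: stats.norm(0.99, 0.01)}

toy_prediction = compare_log_possibilities([toy_increase, toy_drop], [1.012, 1.008])
print(toy_prediction)
```

Because the logarithm is monotonic, the class with the largest sum of log densities is the same class that has the largest product of densities, so the prediction does not change.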

Step 4| Run the Program:

    # flatten to a plain 1-D array so that integer indexing is positional
    close = aapl['Close'].to_numpy().ravel()

    drop = {}
    increase = {}
    for day in range(10, len(close) - 1):
        previous_close = close[day - 10:day]
        # nine day-over-day ratios from the ten closes before `day`
        ratios = [previous_close[i] / previous_close[i - 1]
                  for i in range(1, len(previous_close))]
        if close[day + 1] > close[day]:
            target = increase
        elif close[day + 1] < close[day]:
            target = drop
        else:
            continue  # unchanged close: belongs to neither class
        for i, ratio in enumerate(ratios):
            target.setdefault(i, []).append(ratio)

    # features for the prediction: ratios over the most recent window
    new_close = close[-11:-1]
    X = [new_close[i] / new_close[i - 1] for i in range(1, len(new_close))]
    print(X)

    dicts = [increase, drop]
    dicts = values_to_norm(dicts)
    print(compare_possibilities(dicts, X))

This last part runs all the functions together. It sorts the historical windows into the two classes, gathers the 9 ratios between the closing prices of the most recent 10-day window, and then predicts whether the price will increase or drop. The value it produces is the index of the winning dictionary in the list dicts. If it is 0, the price is predicted to increase. If it is 1, the price is predicted to drop.
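
A single prediction says nothing about how often such predictions are right. One way to check, sketched here under assumptions, is to hold out the most recent examples and measure accuracy on them. Every function name below is hypothetical, and synthetic random-walk prices stand in for the AAPL data so that the block runs on its own. On a random walk, roughly chance-level accuracy is what one should expect.

```python
import numpy as np
from scipy import stats

def make_examples(close, window=10):
    """Build (ratios, label) pairs; label 0 = increase, 1 = drop."""
    examples = []
    for day in range(window, len(close) - 1):
        prev = close[day - window:day]
        ratios = prev[1:] / prev[:-1]
        if close[day + 1] > close[day]:
            examples.append((ratios, 0))
        elif close[day + 1] < close[day]:
            examples.append((ratios, 1))
    return examples

def fit(examples):
    """Fit one normal distribution per class and per feature."""
    models = []
    for label in (0, 1):
        rows = np.array([r for r, y in examples if y == label])
        models.append([stats.norm(col.mean(), col.std()) for col in rows.T])
    return models

def predict(models, ratios):
    """Return the class index with the largest product of densities."""
    scores = [np.prod([d.pdf(v) for d, v in zip(dists, ratios)])
              for dists in models]
    return int(np.argmax(scores))

# Synthetic random-walk prices stand in for real data (assumption)
rng = np.random.default_rng(0)
close = 100 * np.cumprod(1 + rng.normal(0, 0.01, 600))

examples = make_examples(close)
split = int(len(examples) * 0.8)
train, held_out = examples[:split], examples[split:]
models = fit(train)
accuracy = np.mean([predict(models, r) == y for r, y in held_out])
print(f"hold-out accuracy: {accuracy:.2f}")
```

Splitting by time, rather than shuffling, keeps future days out of the training set, which matters for price data.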

Conclusion:

This program is just the basic framework of a Gaussian Naive Bayes algorithm. Here are a few ways that you can improve my program:

  • Increase the number of features


You can include features such as volume and opening price, to increase the scope of the data. However, an overload of data could cause Gaussian Naive Bayes to be less effective, as it does not perform well with big data.

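
As a rough sketch of what adding a feature could look like, the helper below appends volume ratios to the closing-price ratios for one window. `window_features` is a hypothetical name, and the two series are made up for illustration.

```python
import numpy as np

def window_features(close, volume, day, window=10):
    """Close ratios followed by volume ratios for the window before `day`."""
    c = np.asarray(close[day - window:day], dtype=float)
    v = np.asarray(volume[day - window:day], dtype=float)
    return np.concatenate([c[1:] / c[:-1], v[1:] / v[:-1]])

# Made-up series, for illustration only
toy_close = np.linspace(100.0, 110.0, 11)
toy_volume = np.linspace(1_000_000.0, 1_200_000.0, 11)
features = window_features(toy_close, toy_volume, day=10)
print(features.shape)
```

Since the class dictionaries in the program are keyed by feature index, a longer feature vector like this one should slot in without changes to values_to_norm or compare_possibilities.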

  • Link to Alpaca API

The Alpaca API is a great platform for testing trading strategies. Try linking this program to it to make buy or sell trades, based on the predictions of the model!

Translated from: https://medium.com/analytics-vidhya/using-probabilistic-machine-learning-to-improve-your-stock-trading-b40782f3710d
