Creating a Scalping Strategy in Python with a 74% Win Rate

Comprehensive backtesting of an unconventional trading strategy

Nikhil Adithyan
DataDrivenInvestor


Photo by Aidan Hancock on Unsplash

Introduction

Recently, I took a break from backtesting the usual trading strategies built using technical indicators and went on a search spree to explore some unconventional strategies. That’s when I came across the concept of Scalping trading and how it’s used by traders in the market.

I was fascinated by it and tried my hand at the strategy. I came up with my own scalping strategy and backtested it in Python, and the results were pretty interesting.

In this article, I’m going to explain the strategy I implemented with Python and how you can develop one too. So we’ll first get some background about our trading strategy and then we’ll proceed to the coding part where we’ll obtain the historical and intraday data using FinancialModelingPrep’s (FMP) APIs and backtest our strategy in Python.

Without further ado, let’s dive into the article!

Our Scalping Trading Strategy

Before diving into the mechanics of our trading strategy, it’s necessary to first have some good knowledge about the concept of Scalping Trading.

Scalping Trading

Scalping trading is an unconventional approach to trading where traders aim to profit from small price movements.

For example, a trader who follows scalping trading would buy a large number of, let’s say, Apple (AAPL) shares at $170 and aim to sell them after a very small price increase, at $170.50 or $171.

The reason for selling the shares quickly after such a small price change is the massive amount of capital involved in each trade. Scalping yields a sizable profit only when it’s done with a large amount of capital.

The Trading Strategy

Now that we have a good understanding of what scalping trading is, let’s dig deep into our trading strategy.

So the basic idea of scalping trading is to profit from small price changes. Obviously, we can’t use this as our strategy as it is, since it’s way too blunt, but what we can do is keep it as the foundation and build upon it to create an efficient scalping strategy.

The following are the mechanics of our trading strategy:

We enter the market if: The market opens more than 1% above the previous day’s close.

We exit the market when: The stock’s price rises 1% above the buying price, which is the day’s opening price. If the stock fails to reach the 1% increase, we exit at the end of the trading day at the closing price.

The goal here is not to create a complicated strategy with the various entry and exit conditions but to create a simple one to understand the nature of scalping trading. With that being said, let’s proceed to the coding part!
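As a rough illustration of the two rules above, here is a minimal sketch in plain Python. The function names `should_enter` and `exit_price` and their parameters are hypothetical, and the gap is measured relative to the previous close, which is a simplifying assumption rather than the exact formula used later in the article.

```python
def should_enter(prev_close, open_price, threshold_pct=1.0):
    """Entry rule: today's open is more than threshold_pct above the previous close."""
    gap_pct = (open_price - prev_close) / prev_close * 100
    return gap_pct > threshold_pct


def exit_price(open_price, intraday_closes, target_pct=1.0):
    """Exit rule: first intraday close at or above the target, else the last close of the day."""
    target = open_price * (1 + target_pct / 100)
    for price in intraday_closes:
        if price >= target:
            return price
    return intraday_closes[-1]


# Toy numbers, not real market data
enter_gap_up = should_enter(prev_close=100.0, open_price=101.5)
enter_small_gap = should_enter(prev_close=100.0, open_price=100.5)
sold_at_target = exit_price(100.0, [100.2, 101.1, 102.0])
sold_at_close = exit_price(100.0, [99.0, 98.5])
```

The first call enters because the open is 1.5% above the previous close, while the second does not. The exit helper returns the first price that reaches the target and falls back to the final close otherwise.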

Importing Packages

The first and foremost step is to import all the required packages into our Python environment. In this article, we’ll be using six packages, which are:

  • Pandas — for data formatting, cleaning, manipulating, and other related purposes
  • Matplotlib — for creating charts and different kinds of visualizations
  • Requests — for making API calls in order to extract data
  • Termcolor — to customize the standard output shown in Jupyter notebook
  • Math — for various mathematical functions and operations
  • NumPy — for numerical and high-level mathematical functions

The following code imports all the above-mentioned packages into our Python environment:

# IMPORTING PACKAGES

import requests
import pandas as pd
import matplotlib.pyplot as plt
from termcolor import colored as cl
import numpy as np
import math

If you haven’t installed any of the imported packages, make sure to do so using the pip command in your terminal.

Extracting Historical Data

Obtaining the historical data of the stock is vital for the backtesting process. For data accuracy and reliability, we will use FinancialModelingPrep’s (FMP) historical data endpoint, which allows the extraction of end-of-day data for any specific stock. We are going to backtest our scalping strategy on Tesla’s stock, and the following code extracts its historical data from the start of 2014:

# EXTRACTING HISTORICAL DATA

api_key = 'YOUR API KEY'
tsla_json = requests.get(f'https://financialmodelingprep.com/api/v3/historical-price-full/TSLA?from=2014-01-01&apikey={api_key}').json()

tsla_df = pd.DataFrame(tsla_json['historical']).drop('label', axis = 1)
tsla_df = tsla_df.set_index('date')
tsla_df = tsla_df.iloc[::-1]
tsla_df.index = pd.to_datetime(tsla_df.index)

tsla_df

The code is very simple. We are first storing the API key in the api_key variable. Make sure to replace YOUR API KEY with your secret API key which you can obtain once you create an FMP developer account.

Then, using the get function provided by the Requests package, we are making an API call to get the historical data of TSLA. Finally, we are converting the extracted JSON response into a workable Pandas dataframe along with some data manipulation and this is the output:

TSLA Historical Data Extracted using FMP’s API (Image by Author)

One thing that I truly love about the API response given by FMP’s historical data endpoint is the sheer amount of additional data that comes with it. Apart from the OHLC data, you get VWAP, Change %, and much more, which can be really useful in a lot of scenarios.
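To make the data manipulation steps easier to follow without an API key, here is a small offline sketch. The `sample_json` payload is a made-up stand-in for the API response: only the `historical` key and the `date`, `open`, `close`, and `label` fields are taken from the code above, while the `symbol` key and all values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical payload with newest rows first, mimicking the shape used in the article's code
sample_json = {
    "symbol": "TSLA",
    "historical": [
        {"date": "2024-01-03", "open": 101.0, "close": 102.0, "label": "January 03, 24"},
        {"date": "2024-01-02", "open": 100.0, "close": 100.5, "label": "January 02, 24"},
    ],
}

# Same steps as in the extraction code: drop the label, index by date, reverse, parse dates
sample_df = pd.DataFrame(sample_json["historical"]).drop("label", axis=1)
sample_df = sample_df.set_index("date")
sample_df = sample_df.iloc[::-1]
sample_df.index = pd.to_datetime(sample_df.index)
```

After these steps the rows run from the oldest date to the newest, which is the order the later percentage change calculation relies on.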

Calculating % Change and Filtering

According to the trading strategy, we enter the market if the current day’s opening price is 1% higher than the previous day’s closing price. For this, we first need to calculate the percentage change in the prices with the help of the obtained historical data. The following code does the same:

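# CALCULATING % CHANGE

tsla_df['pclose_open_pc'] = np.nan

for i in range(1, len(tsla_df)):
    # Gap of the current day's open over the previous day's close, relative to their average
    prev_close, open_p = tsla_df['close'].iloc[i-1], tsla_df['open'].iloc[i]
    diff, avg = (open_p - prev_close), (prev_close + open_p)/2
    pct_change = (diff / avg)*100
    tsla_df.loc[tsla_df.index[i], 'pclose_open_pc'] = pct_change

# Drop the first row (it has no previous close) and the unused change columns
tsla_df = tsla_df.dropna().drop(['change', 'changePercent', 'changeOverTime'], axis = 1)
# Keep only the days that opened more than 1% above the previous close
tsla_df = tsla_df[tsla_df.pclose_open_pc > 1]

tsla_df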

The code might look a little involved with the for-loop, but it’s actually quite simple. Let me break it down.

First, we create a new column called pclose_open_pc to store the percentage change values. Then comes the for-loop. The main idea behind it is to do something similar to the pct_change() function provided by Pandas.

The difference is that the for-loop calculates the percentage change between two different variables, i.e., the current day’s opening price and the previous day’s closing price. That is not possible with pct_change(), which only compares the current and the previous value of a single variable. Also note that the loop divides the difference by the average of the two prices rather than by the previous close alone.

After that, we drop some columns and slice the dataframe using a condition that selects rows where the percentage change is greater than 1, i.e., days on which the opening price is more than 1% higher than the previous day’s closing price.

This is the final dataframe after all the processing:

Filtered TSLA historical data (Image by Author)
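The same gap calculation can also be written without a loop by shifting the close column down by one row. The sketch below uses a small made-up price table (the values are not real TSLA data) and assumes the gap is measured as the current open minus the previous close, divided by the average of the two, as the strategy describes.

```python
import pandas as pd

# Toy daily prices in ascending date order; values are invented for illustration
prices = pd.DataFrame(
    {"open": [100.0, 102.0, 101.0, 104.0], "close": [100.5, 101.5, 102.0, 103.0]},
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"]),
)

# shift(1) aligns each row with the previous day's close
prev_close = prices["close"].shift(1)
diff = prices["open"] - prev_close
avg = (prices["open"] + prev_close) / 2
prices["pclose_open_pc"] = diff / avg * 100

# The first row has no previous close (NaN), so the comparison drops it automatically
gap_up_days = prices[prices["pclose_open_pc"] > 1]
```

In this toy table, only the second and fourth days open more than 1% above the prior close, so those are the two rows that remain.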

Now that we have prepared the data, it’s time for us to actually build and test our scalping trading strategy.

Backtesting the Strategy

We have arrived at one of the most important and interesting steps in this article. Now that we know the ins and outs of our trading strategy let’s build it and backtest it in Python. We are going to follow a very basic and straightforward system of backtesting for the sake of simplicity. The following code backtests the scalping strategy:

investment = 100000
equity = investment
earning = 0

earnings_record = []

for i in range(len(tsla_df)):

    # EXTRACTING INTRADAY DATA
    date = str(tsla_df.index[i])[:10]
    intra_json = requests.get(f'https://financialmodelingprep.com/api/v3/historical-chart/1min/TSLA?from={date}&to={date}&apikey={api_key}').json()
    intra_df = pd.DataFrame(intra_json)
    intra_df = intra_df.set_index(pd.to_datetime(intra_df.date)).iloc[::-1]

    # ENTERING POSITION
    # Buy as many whole shares as the current equity allows at the day's open
    open_p = tsla_df['open'].iloc[i]
    no_of_shares = math.floor(equity/open_p)
    equity -= (no_of_shares * open_p)

    # EXITING POSITION
    # Percentage change of each 1-minute close against the entry price (relative to their average)
    diff, avg = (intra_df['close'] - open_p), (intra_df['close'] + open_p)/2
    intra_df['p_change'] = (diff / avg)*100
    intra_df = intra_df.dropna()
    greater_1 = intra_df[intra_df.p_change > 1]

    if len(greater_1) > 0:
        # Sell at the first bar that is more than 1% above the entry price
        sell_price = greater_1.iloc[0].close
    else:
        # Target not reached: sell at the last bar of the day
        sell_price = intra_df.iloc[-1].close
    equity += (no_of_shares * sell_price)

    # CALCULATING TRADE EARNINGS
    # investment tracks the equity before this trade, so earning is the profit of this trade only
    investment += earning
    earning = round(equity-investment, 2)
    earnings_record.append(earning)

    if earning > 0:
        print(cl('PROFIT:', color = 'green', attrs = ['bold']), f'Earning on {date}: ${earning}; Bought ', cl(f'{no_of_shares}', attrs = ['bold']), 'stocks at ', cl(f'${open_p}', attrs = ['bold']), 'and Sold at ', cl(f'${sell_price}', attrs = ['bold']))
    else:
        print(cl('LOSS:', color = 'red', attrs = ['bold']), f'Loss on {date}: ${earning}; Bought ', cl(f'{no_of_shares}', attrs = ['bold']), 'stocks at ', cl(f'${open_p}', attrs = ['bold']), 'and Sold at ', cl(f'${sell_price}', attrs = ['bold']))

Instead of diving deep into each and every line of the code, I’ll try to give a gist of this backtesting system’s code.

Basically, the code can be separated into four sections (as can be seen in the code’s comments):

  • The first one is the extraction of the intraday data using FMP’s intraday data endpoint for days when we enter the market.
  • The second section is where we enter our position by buying the stocks at the day’s opening price.
  • The third section is the code for exiting our position where we first calculate the percentage change in prices and then we square off the position as soon as there is a 1% increase in price.
  • The fourth section is dedicated to calculating the earnings of each trade and printing the results in the output terminal.
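The entry, exit, and earnings steps listed above can be tried out offline on a single day. The following is a minimal sketch with a hypothetical helper, `simulate_trade`, and an invented list of one-minute closes; it skips the API call entirely and is not the article’s full backtest.

```python
import math


def simulate_trade(equity, open_price, intraday_closes, target_pct=1.0):
    """Simulate one trade and return (new_equity, shares, sell_price)."""
    # Enter: buy as many whole shares as the equity allows at the open
    shares = math.floor(equity / open_price)
    cash = equity - shares * open_price

    # Exit: default to the last close, unless a bar is more than target_pct above entry
    sell_price = intraday_closes[-1]
    for close in intraday_closes:
        avg = (close + open_price) / 2
        if (close - open_price) / avg * 100 > target_pct:
            sell_price = close
            break

    return cash + shares * sell_price, shares, sell_price


# Day where the target is reached on the third bar (toy prices)
win_equity, win_shares, win_sell = simulate_trade(100_000, 250.0, [250.5, 251.8, 253.0, 252.0])

# Day where the target is never reached, so the trade closes at the last bar
loss_equity, loss_shares, loss_sell = simulate_trade(100_000, 250.0, [249.0, 248.0])
```

With these toy inputs, 400 shares are bought at $250. The first day exits at $253 for a gain of $1,200, and the second exits at $248 for a loss of $800.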

The program made a lot of trades, and though it’s not possible to show all of them, here’s a glimpse of the trades that were executed:

Trades generated by the program (Image by Author)

The output includes all necessary details like the number of stocks, buying price, selling price, date, and earnings. Though there are some losing trades, the majority are profitable trades with reasonable earnings.

Strategy Returns Analysis & Evaluation

In this section, we are going to dive deep into the performance of the strategy. Let’s start off with the basic metrics of strategy returns and ROI. The following code calculates the total earnings and ROI of the strategy:

# STRATEGY RETURNS

print(cl(f'TSLA BACKTESTING RESULTS:', attrs = ['bold']))
print(' ')

strategy_earning = round(equity - 100000, 2)
roi = round(strategy_earning / 100000 * 100, 2)

print(cl(f'EARNING: ${strategy_earning} ; ROI: {roi}%', attrs = ['bold']))

The code is very simple. We are just implementing the mathematical formulas for calculating total earnings and Return On Investment (ROI) to arrive at the final figures and this is the final output:

Backtesting results (Image by Author)

So our scalping trading strategy has generated total earnings of $110K with an ROI of 110% over the course of ten years (2014–2024). That’s not bad. Though the results are not overwhelming, they are still reasonably good.
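Since the ROI is spread over ten years, it can help to also look at it on a yearly basis. The sketch below is only an illustration: the final equity of $210,000 is an assumed round figure matching the reported earnings of about $110K, and `roi_pct` and `annualized_pct` are hypothetical helper names.

```python
def roi_pct(initial, final):
    """Total return on investment, in percent."""
    return round((final - initial) / initial * 100, 2)


def annualized_pct(initial, final, years):
    """Compound annual growth rate, in percent."""
    return round(((final / initial) ** (1 / years) - 1) * 100, 2)


initial = 100_000
final = 210_000  # assumed round figure, not the exact backtest output
years = 10

total_roi = roi_pct(initial, final)
yearly_rate = annualized_pct(initial, final, years)
```

Under these assumptions, a 110% total return over ten years works out to a compounded rate of a little under 8% per year.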

Now let’s evaluate our trading strategy using some basic metrics. But in order to calculate those metrics, let’s first create and process the data for that:

# One row per trading day on which the strategy entered the market
earnings_df = pd.DataFrame({'date': tsla_df.index, 'earning': earnings_record})

earnings_df.tail()

The main purpose of this code is to create a dataframe that contains the details of the earnings generated by the strategy on each trading day. With the help of the earnings_record list we created in the backtesting code to record the earnings data, we can easily create a Pandas dataframe out of it and that’s exactly what we’ve done in the code. This is the final output:

Earnings data (Image by Author)

With the required data in place, we can proceed to calculate the metrics for evaluating our trading strategy. Before that, here’s a list of the metrics we’re going to use:

  • Max loss: The earnings of the worst trade
  • Max profit: The earnings of the best trade
  • Total trades: The number of trades generated by the strategy
  • Win rate: The percentage of trades that ended in a profit
  • Average Trades per Month and Average Earnings per Month

The following code calculates all the above-discussed metrics:

max_loss = earnings_df.earning.min()
max_profit = earnings_df.earning.max()

# Count winning and losing trades (trades with exactly zero earnings are ignored)
no_of_wins = (earnings_df.earning > 0).sum()
no_of_losses = (earnings_df.earning < 0).sum()
no_of_trades = no_of_wins + no_of_losses
win_rate = (no_of_wins / no_of_trades)*100

# 120 = number of months in the ten-year backtest period
print(cl('MAX LOSS:', color = 'red', attrs = ['bold']), f'${max_loss};',
      cl('MAX PROFIT:', color = 'green', attrs = ['bold']), f'${max_profit};',
      cl('TOTAL TRADES:', attrs = ['bold']), f'{no_of_trades};',
      cl('WIN RATE:', attrs = ['bold']), f'{round(win_rate)}%;',
      cl('AVG. TRADES/MONTH:', attrs = ['bold']), f'{round(no_of_trades/120)};',
      cl('AVG. EARNING/MONTH:', attrs = ['bold']), f'${round(strategy_earning/120)}'
     )

plt.style.use('ggplot')

earnings_df.earning.hist()
plt.title('Earnings Distribution')
plt.show()

earnings_df = earnings_df.set_index('date')

earnings_df.earning.cumsum().plot()
plt.title('Strategy Cumulative Returns')
plt.show()

Apart from calculating the metrics, there are some visualizations involved in the code which we’ll discuss in a minute. This is the output:

Strategy Performance (Images by Author)

Let’s first talk about the metrics. The Max loss is $15.1K while the Max profit is $5K. Ideally, the absolute value of the Max loss should be smaller than the Max profit, as that reduces the risk of the strategy. Here it’s the other way around: the worst trade lost about three times what the best trade earned.

The total number of trades generated by the strategy is 501 over the course of ten years. The figure is not overwhelming, which is a good thing. The average number of trades executed in a month is 4, which is, again, a reasonable number. The average return per month is $923, which is a solid number.

The most important metric of all is the Win rate. The win rate of our strategy is 74%. This means that 74% of the trades in the backtest closed at a profit, which is impressive. A win rate greater than 50% is often taken as a sign of an effective and less risky strategy. But still, this number doesn’t mean much on its own: when the worst loss is several times the best profit, a few losing trades can wipe out many winners, so a proper risk management system has to be in place.
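To see why a high win rate is not enough by itself, it helps to combine it with the average size of wins and losses. The sketch below computes the average expected earning per trade on a made-up list of four trades; the `expectancy` helper and the numbers are purely illustrative and not taken from the backtest.

```python
def expectancy(earnings):
    """Return (win_rate, expected earning per trade) for a list of trade earnings."""
    wins = [e for e in earnings if e > 0]
    losses = [e for e in earnings if e < 0]
    n = len(wins) + len(losses)
    win_rate = len(wins) / n
    avg_win = sum(wins) / len(wins) if wins else 0.0
    avg_loss = sum(losses) / len(losses) if losses else 0.0
    # Weighted average of the typical win and the typical loss
    return win_rate, win_rate * avg_win + (1 - win_rate) * avg_loss


# Three small wins and one large loss (invented numbers)
toy_earnings = [500, 500, 500, -2000]
toy_win_rate, toy_expectancy = expectancy(toy_earnings)
```

Here the win rate is 75%, yet the expected earning per trade is negative because the single loss is four times the size of each win.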

Coming to the charts, there are two of them in the output above. The first one is a histogram that shows the earnings distribution of our strategy. It can be seen that most of the trades earned around $1,500 to $2,500.

The second chart is a line chart that shows the cumulative returns of our trading strategy. The first thing that can be noticed is the unstable movement of the line. This indicates that the strategy’s earnings swing a lot, meaning that our strategy is volatile and risky in some cases. Ideally, a strategy should avoid the sudden dips that can be seen in the chart.
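One common way to put a number on those dips is the maximum drawdown, i.e., the largest drop of the cumulative earnings from an earlier peak. This metric is not part of the article’s code; the sketch below shows how it could be computed with Pandas on an invented series of trade earnings.

```python
import pandas as pd

# Invented per-trade earnings, in the same spirit as earnings_record
toy_earnings = pd.Series([1000.0, 1500.0, -4000.0, 500.0, 2000.0, -1000.0])

cumulative = toy_earnings.cumsum()      # running total of earnings
running_peak = cumulative.cummax()      # highest total reached so far
drawdown = cumulative - running_peak    # zero at a peak, negative below it
max_drawdown = drawdown.min()
```

In this toy series, the cumulative earnings peak at $2,500 and then fall to a loss of $1,500, so the maximum drawdown is $4,000.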

Buy/Hold Returns Comparison

A good trading strategy should not just be able to generate profitable returns but must be efficient enough to outperform the buy/hold strategy. For those who don’t know what the buy/hold strategy is, it’s a strategy where the trader buys the stock and holds it for a long period, no matter what the circumstances are.

If our strategy beats the buy/hold strategy, we can confidently say that we came up with a good trading strategy that is almost ready to be deployed in the real world. Whereas, if it fails to do so, we have to make a pretty good amount of changes to the strategy.

The following code implements the buy/hold strategy and calculates the returns:

api_key = 'YOUR API KEY'
bh_json = requests.get(f'https://financialmodelingprep.com/api/v3/historical-price-full/TSLA?from=2014-01-01&apikey={api_key}').json()

df = pd.DataFrame(bh_json['historical']).drop('label', axis = 1)
df = df.set_index('date')
df = df.iloc[::-1]
df.index = pd.to_datetime(df.index)

# NOTE: summing the daily percentage changes ignores compounding, so this is only a rough ROI figure
tsla_roi = round(list(df['close'].pct_change().cumsum())[-1],4)*100
print(cl(f'BUY/HOLD STRATEGY ROI: {round(tsla_roi,2)}%', attrs = ['bold']))

In the above code, we are again extracting the historical data of Tesla using FMP’s historical data endpoint since we made a number of changes to the previously extracted data. Following some data manipulations, we sum the daily percentage changes of the closing price to calculate the returns and this is the final output:

Buy/Hold Strategy ROI (Image by Author)

After comparing the results of the buy/hold strategy and our scalping trading strategy, the buy/hold strategy outperformed ours by 333 percentage points in ROI. What does this mean? It means we have to work a lot on our scalping strategy before taking it to the market.
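A note on the buy/hold figure: the code above adds up the daily percentage changes, which is not the same as the return an investor actually gets from buying on the first day and selling on the last. The sketch below compares the two calculations on a deliberately extreme, made-up price series so the difference is easy to see.

```python
import pandas as pd

# Invented closing prices: the price doubles, halves, then doubles again
close = pd.Series([100.0, 200.0, 100.0, 200.0])

# Approach used in the article: sum of the daily percentage changes
summed_roi = close.pct_change().sum() * 100

# Buy on the first day, sell on the last day
compounded_roi = (close.iloc[-1] / close.iloc[0] - 1) * 100
```

The summed version reports 150% (100% minus 50% plus 100%), while the price only doubled from start to finish, a 100% return. On real data the two figures can diverge in either direction, so it may be worth computing the buy/hold ROI both ways before drawing conclusions.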

Conclusion

Although we went through an extensive process of coming up with the trading strategy, obtaining the historical and intraday data using FMP’s APIs, and backtesting the strategy, we still failed to outperform the buy/hold strategy. But the aim of this article is not to showcase any money-making methods with the help of these strategies but to give a gentle introduction to scalping trading.

If you really want to turn this strategy into a highly profitable one and deploy it to the market, there are a few improvements that you can make. The first is to tune the strategy parameters. Try playing around with the entry and exit conditions and test which combination works best for the trading strategy. The next is to put a proper risk management system in place. This is something we didn’t cover in this article, but it’s crucial, especially when you want to take the strategy to the real-world market.
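As one possible starting point for risk management, the exit rule could be extended with a stop-loss so that a trade is closed once the price falls a fixed percentage below the entry. This is a sketch of an idea that was not backtested in this article; `exit_with_stop` and its `stop_pct` parameter are hypothetical, and the prices are invented.

```python
def exit_with_stop(open_price, intraday_closes, target_pct=1.0, stop_pct=1.0):
    """Return (sell_price, reason) for a trade with a profit target and a stop-loss."""
    target = open_price * (1 + target_pct / 100)
    stop = open_price * (1 - stop_pct / 100)
    for close in intraday_closes:
        if close >= target:
            return close, "target"
        if close <= stop:
            return close, "stop"
    # Neither level was hit: exit at the last close of the day
    return intraday_closes[-1], "close"


stopped = exit_with_stop(100.0, [100.5, 98.5, 102.0])
hit_target = exit_with_stop(100.0, [100.5, 101.2])
held_to_close = exit_with_stop(100.0, [100.2, 99.8])
```

In the first call the trade is stopped out at $98.50 before the later rally to $102, which shows the trade-off: a stop-loss caps the worst-case loss but can also cut off trades that would have recovered. Both `target_pct` and `stop_pct` are parameters you could tune in a backtest.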

With that being said, you’ve reached the end of the article. Hope you learned something new and useful today. If you have any suggestions to improve the trading strategy, kindly let me know in the comments. Thank you very much for your time.
