Setting Up My Semi-Automated, ML-Based Trading Strategy

Neto Figueira
Feb 6, 2024

Now that I have an ML model to trade stocks (you can read about it here), it's time to build the infrastructure to place the trades in real life. For now, it's going to be a “partially” systematic approach to trading, because, as far as I know, there is no good API available for trading stocks in Brazilian markets. If things go well, I'll later set up a MetaTrader server to fully automate the trades, but for now, let's keep things simple. So, this project will work with two main scripts:

  • Signal Processor: runs once a day and evaluates the model's prediction at the market closing price (my model was trained only on close prices, which avoids look-ahead bias here), then reports the prediction outcomes.
  • Signal Monitor: monitors the open trades to check whether each one reached a stop gain, a stop loss, or timed out.

I’ll use an S3 bucket on AWS to store a CSV file with the trade records, and the scripts will run on an EC2 instance, triggered every day by a cron job.

AWS Infrastructure

The structure here is really simple: I’ve set up an S3 bucket called “trading_signals” to hold the trade records.
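
The bucket holds a single trading_signals.csv file. Its columns match the ones the signal processor writes further below; the sample row here is made up purely for illustration:

stock_symbol,entry_date,entry_price,upper_barrier,stop_loss,status,volatility,prob,n_days
PETR4.SA,2024-02-06,38.50,39.27,37.73,open_signal,0.02,0.64,0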

Signal Processor

The signal_processor script connects to the bucket with boto3 and reads the CSV file via smart_open as follows:

import logging
import os

import boto3
import pandas as pd
import xgboost as xgb
import yfinance as yf
from smart_open import open  # smart_open provides the s3://-aware open below

aws_access_key_id = os.getenv('AWS_ACCESS_KEY')
aws_secret_access_key = os.getenv('AWS_SECRET_ACCESS_KEY')
bucket_name = os.getenv('bucket_name')
region = 'us-east-1'
file_key = 'trading_signals.csv'

s3_client = boto3.client('s3', aws_access_key_id=aws_access_key_id,
                         aws_secret_access_key=aws_secret_access_key, region_name=region)

# Specify data types for the columns
dtype = {'entry_price': float, 'upper_barrier': float, 'stop_loss': float,
         'volatility': float, 'prob': float, 'n_days': int}

# Read the CSV file directly from S3 into a Pandas DataFrame
with open(f's3://{bucket_name}/{file_key}', 'r',
          transport_params={'client': s3_client}) as file:
    signals_df = pd.read_csv(file, dtype=dtype)
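
As a side note, if s3fs is installed, pandas can read the s3:// path directly and smart_open isn’t strictly needed; a minimal sketch using the same environment variables:

signals_df = pd.read_csv(
    f's3://{bucket_name}/{file_key}',
    dtype=dtype,
    storage_options={'key': aws_access_key_id, 'secret': aws_secret_access_key},
)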

It then checks whether there are stocks available to trade. I’m currently trading at most three stocks: VALE3, PETR4, and PRIO3. First, the code checks which of them already have open trades:

stocks_features = {'PRIO3.SA': ['BZ=F', 'CL=F'],
                   'VALE3.SA': ['HG=F', 'TIO=F'],
                   'PETR4.SA': ['BZ=F', 'CL=F']}

# a stock is unavailable if it already has an open trade; a fresh entry is
# recorded as 'open_signal' and the monitor later flips it to 'open', so both
# statuses count as open here
open_statuses = ['open_signal', 'open']
stocks_available = list(set(stocks_features.keys())
                        - set(signals_df[signals_df.status.isin(open_statuses)]['stock_symbol'].to_list()))
print(stocks_available)

if len(stocks_available) == 0:
    logging.info('Nothing to trade today')
    quit()

If there are stocks available to trade, the script fits the XGBoost model to all available data except the last closing price and makes a prediction:

for stock_symbol in stocks_available:
    print(f"start {stock_symbol} analysis")

    # download the stock price plus its feature tickers from Yahoo Finance
    features_list = stocks_features[stock_symbol]
    download_data = features_list.copy()
    download_data.append(stock_symbol)
    prices_df = yf.download(download_data, period='max', progress=False)
    raw_data = prices_df['Adj Close'].dropna()
    raw_data.rename(columns={stock_symbol: 'STOCK_PRICE'}, inplace=True)

    # initialize XGBoost
    xgboost_params = {
        'objective': 'binary:logistic',  # binary classification
        'eval_metric': 'auc',            # AUC as the evaluation metric
        'seed': 42,                      # random seed
        'verbosity': 1,                  # verbose mode (0 silent, 1 prints messages)
    }
    xgboost_model = xgb.XGBClassifier(**xgboost_params)

    pred, pred_proba, _ = evaluate_signal(raw_data, xgboost_model, 50)
    logging.info(f'Signal type generated: {pred} with probability {pred_proba[0]} for {stock_symbol}')
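
evaluate_signal comes from the modeling article and isn’t reproduced in this post. As a rough sketch of what it does based on the description above (fit on everything except the last close, predict on the last row), with the labeling simplified to a toy forward-return target instead of the real triple-barrier labels:

def evaluate_signal(raw_data, model, window):
    # Hypothetical sketch -- the real features/labels live in the modeling
    # article; here the toy label is 1 when the stock closes higher
    # `window` days ahead
    df = raw_data.copy()
    df['target'] = (df['STOCK_PRICE'].shift(-window) > df['STOCK_PRICE']).astype(int)
    features = df.drop(columns=['target'])
    X_train, y_train = features.iloc[:-1], df['target'].iloc[:-1]
    X_last = features.iloc[[-1]]  # the last closing price, held out
    model.fit(X_train, y_train)
    pred = model.predict(X_last)[0]
    pred_proba = model.predict_proba(X_last)  # shape (1, 2), as used above
    return pred, pred_proba, model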

With the prediction available, we can evaluate whether it is an entry signal and generate the values needed to execute and monitor the trade. The barriers scale with the last price and the daily volatility: for example, with a last price of 100, vol = 0.02, and (say) threshold = 1, the upper barrier lands at 102 and the stop loss at 98.

    if pred == 1:
        text = f"ENTRY SIGNAL SPOTTED! {today} - prob: {pred_proba}"
        proba = pred_proba[0][1]
        last_price, vol_df = get_last_price(stock_symbol)
        vol = vol_df.loc[today_str]

        # upper barrier and stop loss calculation
        upper_barrier = last_price + (last_price * vol) * threshold
        stop_loss = last_price - (vol * last_price) * threshold

        columns = ["stock_symbol", "entry_date", "entry_price", "upper_barrier",
                   "stop_loss", "status", "volatility", "prob", "n_days"]
        res = [[stock_symbol, today_str, last_price, upper_barrier, stop_loss,
                'open_signal', vol, proba, 0]]
        aux = pd.DataFrame(res, columns=columns)

        # append the new signal (written back to S3 below)
        signals_df = pd.concat([signals_df, aux])

    else:
        text = f"NO TRADES FOR TODAY {today} - prob: {pred_proba[0]} for {stock_symbol}"
        print(text)
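
get_last_price is another helper that isn’t shown in the post. Judging by how it’s called, it returns the latest adjusted close together with a daily-volatility series indexed by date string; a minimal sketch, assuming yfinance data and an EWMA volatility of daily returns:

def get_last_price(stock_symbol):
    # Hypothetical helper -- data source and volatility estimator are assumptions
    prices = yf.download(stock_symbol, period='1y', progress=False)['Adj Close']
    vol_df = prices.pct_change().ewm(span=20).std()
    vol_df.index = vol_df.index.strftime('%Y-%m-%d')  # so vol_df.loc[today_str] works
    return float(prices.iloc[-1]), vol_df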

If pred == 1, the script appends the new row to signals_df and writes it back to the S3 bucket; it also sends an email to my account to let me know the outcome of the daily predictions:

send_email(password=password, text=text)

# Write the Pandas DataFrame back to the CSV file in S3
with open(f's3://{bucket_name}/{file_key}', 'w', newline='',
          transport_params={'client': s3_client}) as file:
    signals_df.to_csv(file, index=False)

Here’s the send_email function:

import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def send_email(text, password):
    # SMTP settings
    smtphost = 'smtp.gmail.com'
    smtpport = 587
    username = 'xxxxxxxxxxy@gmail.com'
    mail_to = 'xxxxxxxn@gmail.com'
    subject = 'TRADING BOT RUN'
    mail_from = username

    # build the message
    msg = MIMEMultipart('alternative')
    msg['Subject'] = subject
    msg['From'] = mail_from
    msg['To'] = mail_to
    htmlbody = MIMEText(text, 'html')
    msg.attach(htmlbody)

    try:
        server = smtplib.SMTP(smtphost, smtpport)
        server.ehlo()
        server.starttls()
        server.login(username, password)
        server.sendmail(mail_from, mail_to, msg.as_bytes())
        server.close()
        return True
    except Exception as ex:
        print(ex)
        return False
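
A quick usage note: with 2-step verification enabled, Gmail’s SMTP login on port 587 only accepts an app password, not the account password, so that’s what should be loaded here (the environment variable name is my assumption):

password = os.getenv('EMAIL_PASSWORD')  # hypothetical name; a Gmail app password
send_email(text='daily run test', password=password)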

In my mailbox, I can check the results of each run.

Signal Monitor

The “signal_monitor.py” script handles the open trades, monitoring stock prices to see whether each trade reached a stop gain, a stop loss, or timed out. It reads the data from the bucket, filters the open trades, and calculates the values needed to compare against the current stock price.

# both statuses represent trades that are still running (see signal_processor)
open_trades = signals_df[signals_df.status.isin(['open_signal', 'open'])]

for i, row in open_trades.iterrows():
    ticker = row['stock_symbol']
    entry_date = row['entry_date']
    entry_price = row['entry_price']
    vol = row['volatility']

    # days passed since the trade was opened
    passed_date = datetime.datetime.strptime(entry_date, '%Y-%m-%d')
    today = datetime.datetime.now()
    today_str = today.strftime("%Y-%m-%d")
    days_difference = (today - passed_date).days
    print(ticker, f'days passed {days_difference}')

    # last price of the current stock (get_last_price also returns the
    # volatility series, which isn't needed here)
    last_price, _ = get_last_price(ticker)

    # upper barrier and stop loss calculation
    upper_barrier = entry_price + (entry_price * vol)
    stop_loss = entry_price - vol * entry_price

Then we compare the last price with the upper_barrier and stop_loss parameters and update the trading_signals.csv file as needed:

    if days_difference <= 10:
        if last_price >= upper_barrier:
            trade_type = 'reach_barrier'
        elif last_price <= stop_loss:
            trade_type = 'stop_loss'
        else:
            trade_type = 'open'
    else:
        trade_type = 'timeout'

    # record the outcome and send a close-trade message
    text = f'trade for {ticker} starting {entry_date} reached {trade_type}'
    signals_df.loc[i, 'n_days'] = days_difference
    signals_df.loc[i, 'status'] = trade_type

    send_email(text=text, password=password)

# Write the Pandas DataFrame back to the CSV file in S3
# (s3_client is created the same way as in signal_processor)
with open(f's3://{bucket_name}/{file_key}', 'w', newline='',
          transport_params={'client': s3_client}) as file:
    signals_df.to_csv(file, index=False)

To finish the last piece of my automated trading platform, I’ve set up an EC2 instance and configured the following cron job:

30 17 * * 1-5 /usr/bin/python3 /home/ubuntu/signal_monitor.py
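
Only the monitor’s entry is shown above; signal_processor.py needs its own line in the same crontab so both scripts fire every weekday. A sketch, where the processor’s schedule and path are my assumptions:

# signal_monitor: checks the open trades (from the post)
30 17 * * 1-5 /usr/bin/python3 /home/ubuntu/signal_monitor.py
# signal_processor: generates new signals after the close (hypothetical time and path)
0 18 * * 1-5 /usr/bin/python3 /home/ubuntu/signal_processor.py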

And now I receive the updates in my email as soon as I get a new entry order. Time to lose money for real!
