Here’s a story about how I combined what I learned in my coding bootcamp with my passion for gardening by building a Craigslist-scraping bot called the planter-ropot.

With state parks closed and Shelter in Place orders earlier this year due to COVID-19, it was time to get creative to enjoy nature safely. I’ve always wanted to build my own garden and thought this would be the perfect opportunity, given I was no longer commuting for a few months. I love to frequent and support my local farmers’ markets, always admiring the beautiful and often exotic varieties of produce you can’t find at your everyday grocery store.

I’ve always admired large wooden planters because they give you control over the soil and watering, and they can be moved around. More importantly, I loved that I could potentially plant a variety of vegetables in each box! Purchasing a planter from a hardware store was out of my budget, as was buying directly from someone who builds planter boxes. So I turned to Craigslist.

My search began with checking Craigslist in the morning, mid-day, and evening, every single day. During this time, many people were moving out of the Bay Area and looking to get rid of their garden supplies ASAP, and at no cost! Most raised beds and planter boxes get claimed within 30–60 minutes of being posted, so time is of the essence when e-mailing the poster.

Having graduated from coding bootcamp just two months earlier, I knew it was time to build something to alert me in real time to planter listings in my neighborhood. To do this, I would need to write a program that runs a search on Craigslist and stores the results at regular intervals.

At a high level, the program, written in Python, involves the following steps:

  1. Scrape Craigslist data using Requests library
  2. Parse scraped HTML using Beautiful Soup library
  3. Use Slack API to post to Slack and Twilio API to send SMS
  4. Store most recent listing datetime
  5. Only scrape listings posted after most recent listing datetime
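
Before diving into each step, here’s a minimal sketch of the overall loop. The helper names read_last_scrape, parse_listings, and write_last_scrape are placeholders of my own; craigslist_soup and post_to_slack appear later in this post, the region and search term are illustrative, and the real scraper.py in the repo differs in its details.

import time

# Minimal sketch of the main loop; the helper functions here are placeholders.
while True:
    last_scrape = read_last_scrape()                   # most recent listing datetime, stored in a CSV
    soup = craigslist_soup("sfbay", "planters", last_scrape)
    listings = parse_listings(soup, last_scrape)       # keep only listings newer than last_scrape
    if listings:
        post_to_slack(listings)                        # post to Slack (and text via Twilio if the listing is nearby)
        write_last_scrape(max(item['created_at'] for item in listings))
    time.sleep(600)                                    # sleep for ten minutes between runs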

The program runs continuously, sleeping for ten minutes between runs. This was my first time deploying any application, so it was definitely a learning experience for me. First, all Heroku applications run in lightweight Linux containers called dynos; dynos are what power Heroku apps. After writing and debugging my Python code, I couldn’t figure out how to deploy my application on Heroku for the longest time, because most resources use the following line in the Procfile:

web: python scraper.py 

After countless failed attempts at deploying with this line in my Procfile, I reached out to one of my mentors from Hackbright Academy for help. I found out my Slackbot actually needs to run on a worker dyno rather than a web dyno, since it doesn’t serve any web traffic. The fix was so simple! All I had to do was update my Procfile to:

worker: python scraper.py

and my Slackbot was successfully deployed.

Next, I’ll go into how I scraped Craigslist using the Requests library.

First, the bot sends an HTTP GET request to Craigslist using a search URL for “planters”. The GET request retrieves information such as text and media from the website, which comes back in the form of HTML.

The left side of the screenshot shows what the user sees, and the right (“Inspect Element”) shows what the computer “sees”.

Craigslist doesn’t have an official API, so I began scraping the listings with the Requests library and parsing the HTML with BeautifulSoup.

import requests
from bs4 import BeautifulSoup as b_s

def craigslist_soup(region, term, last_scrape):
    # Build the search URL, e.g. https://sfbay.craigslist.org/search/sss?query=planters
    url = "https://{region}.craigslist.org/search/sss?query={term}".format(
        region=region, term=term
    )
    response = requests.get(url=url)
    soup = b_s(response.content, 'html.parser')
    return soup
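
The field extraction itself isn’t reproduced in this post; the sketch below is my own assumption of what that step could look like. The CSS class names (result-row, result-title, and so on) reflect Craigslist’s search-results markup at the time and may have changed, and the description and image fields would in practice come from each listing’s detail page.

def parse_listings(soup, last_scrape):
    # Assumed parsing step: pull listing fields out of the search-results soup.
    # The class names below are assumptions about Craigslist's markup.
    listings = []
    for row in soup.find_all('li', class_='result-row'):
        created_at = row.find('time', class_='result-date')['datetime']
        if created_at <= last_scrape:        # skip anything already seen in a previous run
            continue
        title = row.find('a', class_='result-title')
        price = row.find('span', class_='result-price')
        hood = row.find('span', class_='result-hood')
        listings.append({
            'cl_id': row.get('data-pid'),
            'created_at': created_at,
            'title_text': title.get_text(strip=True),
            'url': title['href'],
            'price': price.get_text() if price else 'N/A',
            'neighborhood_text': hood.get_text(strip=True) if hood else '',
            # 'description' and 'jpg' would come from the listing's own page in the full scraper
        })
    return listings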

Once the scraped HTML has been parsed, the listings are sent to Slack using the function below. The most recent listing datetime is written to a CSV by a separate function (the full code is viewable in the repo).

from datetime import datetime
from slack import WebClient   # from the slackclient package

def post_to_slack(result_listings):
    # SLACK_TOKEN and SLACK_CHANNEL are loaded from the environment (see the configuration steps below)
    client = WebClient(SLACK_TOKEN)
    for item in result_listings:
        sliced_description = item['description']
        sliced_description = sliced_description[:100] + '…'
        desc = f" {item['neighborhood_text']} | {item['created_at']} | {item['price']} | {item['title_text']} | {item['url']} | {sliced_description} | {item['jpg']} | {item['cl_id']} "
        response = client.chat_postMessage(channel=SLACK_CHANNEL, text=desc)
    print("End scrape {}: Got {} results".format(datetime.now(), len(result_listings)))

But before this will actually work, a Slack app also needs to be created and configured by going to Your Apps. Once the app is created, it is listed under Your Apps as shown below.

Clicking on the App Name leads you to Basic Information, where you can customize the Slackbot. Its OAuth settings can be found under Add features and functionality –> Permissions. Adding chat:write and links:write under Bot Token Scopes is required for your Slackbot to post.

I store the API token locally in my secrets.sh file, where secret variables are kept out of version control. When deploying the Slackbot to Heroku, however, make sure the token is stored as a Heroku config var using the following Bash command:

heroku config:set SLACK_API_TOKEN='INSERT-TOKEN-BETWEEN-THESE-SINGLE-QUOTES'
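
In the Python code, the token can then be read from the environment rather than hard-coded. A minimal sketch, assuming the variable is named SLACK_API_TOKEN as in the command above and that the channel name is configured the same way:

import os
from slack import WebClient   # from the slackclient package

SLACK_TOKEN = os.environ['SLACK_API_TOKEN']       # set by secrets.sh locally, heroku config:set in production
SLACK_CHANNEL = os.environ.get('SLACK_CHANNEL')   # assumed: the channel name is also stored as an env var
client = WebClient(SLACK_TOKEN)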

Once the application is deployed, Craigslist listings will appear in the respective Slack channel, as shown below.

Note: Per my conversation with Slack support, there is a bug where Slackbot posts with unfurled images show as “edited”. Their team is currently working on getting this fixed.

If the listing is located in the same city as the user, a text message of the listing will be sent.

Text message being sent to the user if listing is in user’s neighborhood. Powered by Twilio’s API!
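
The Twilio call itself isn’t shown in this post; a minimal sketch using Twilio’s Python helper library might look like the following, where the send_sms helper and the environment-variable names are my own placeholders.

import os
from twilio.rest import Client

def send_sms(listing_summary):
    # Text the listing to my own phone; credentials and numbers come from env vars (placeholders).
    twilio_client = Client(os.environ['TWILIO_ACCOUNT_SID'], os.environ['TWILIO_AUTH_TOKEN'])
    twilio_client.messages.create(
        body=listing_summary,
        from_=os.environ['TWILIO_FROM_NUMBER'],   # the Twilio phone number
        to=os.environ['MY_PHONE_NUMBER'],         # my own phone number
    )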

Between cloud deployment, API configuration, web scraping, and scheduling, this was such a fun and rewarding project. Just as importantly, I’ve actually found several planters with my planter-ropot!

A few of the other amazing planters I’ve found using planter-ropot! Lovely handpainted and weatherproof sign by my dear friend Belinda Hui.



A link to my GitHub repository (or should I say repotsitory) can be found here. I’ve since adapted this to find kitchen scales, in a channel called #Scales-Funnel. This project was quite the learning journey for me, and I owe special thanks to Vik Paruchuri’s Apartment Finding Slackbot for the inspiration.

Thanks so much for reading about my planter-ropot! Here’s a fun video that was inspired by my Slackbot. Hope you enjoy!
