Scraping Freecycle with AWS - Part 2, Authentication
Creating a serverless application to scrape Freecycle and send me a notification when something is posted.



Part 2
The Lambda
So now we need to update the lambda to actually call the endpoint we need. We are going to develop this locally using tests and then deploy it to AWS.
We built the boilerplate in part 1 so now we can have a think about the actual code that needs to run.
We need to:
- Make an HTTP request to Freecycle
- Authenticate
- Read the list of posts up to the latest one we have seen (based on our recorded date)
- Go to the next page if we haven't read them all yet
- Record the date of the latest post
- Send a notification per new post, if there are any
- Go back to sleep
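The "read up to the latest one we have seen" and "record the date" steps can be sketched on their own, before any scraping exists. This is only an illustration under stated assumptions: the `Post` shape, `collectNewPosts` and `nextLastSeen` are hypothetical names, posts are assumed to arrive newest first, and the pages are passed in as plain arrays rather than fetched one at a time.

```typescript
// Hypothetical shape of a scraped post; the real fields depend on Freecycle's markup.
interface Post {
  title: string
  postedAt: Date
}

// Walk the pages in order (assumed newest first) and stop at the first post
// that is not newer than the recorded date.
const collectNewPosts = (pages: Post[][], lastSeen: Date): Post[] => {
  const fresh: Post[] = []
  for (const page of pages) {
    for (const post of page) {
      if (post.postedAt.getTime() <= lastSeen.getTime()) return fresh
      fresh.push(post)
    }
  }
  return fresh
}

// The date to record for the next run: the newest new post, or the old date
// when nothing new was found.
const nextLastSeen = (fresh: Post[], lastSeen: Date): Date =>
  fresh[0]?.postedAt ?? lastSeen
```

In a real run the second page would only be requested when every post on the first page turned out to be new, which is the "go to the next page" step in the list.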
Make an HTTP request
Node.js ships with a native https module, and it was the only client that actually worked for me. Neither axios nor node-fetch got through, possibly due to some kind of TLS fingerprinting. So we'll use the native module.
The address for Freecycle is https://www.freecycle.org/home/dashboard
Let's update the lambda to make a request to Freecycle and log the response.
import https from 'https'

const getOptions = {
  hostname: 'www.freecycle.org',
  port: 443,
  path: '/home/dashboard',
  method: 'GET',
  headers: {
    'Content-Type': 'text/html',
  },
}

// Wrap the callback-based https.get in a promise
const getLatestPosts = async () =>
  new Promise((resolve, reject) => {
    const req = https.get(getOptions, (res) => {
      let data = ''
      // Collect the body chunk by chunk
      res.on('data', (d) => {
        data += d
      })
      // Resolve once the whole response has arrived
      res.on('end', () => {
        resolve(data)
      })
    })
    req.on('error', (error) => {
      console.error(error)
      reject({ status: 500, body: 'error' })
    })
  })

const handler = async () => {
  console.log(await getLatestPosts())
  return {
    statusCode: 200,
    body: 'hello world',
  }
}

export { handler }
You can see here that I'm using a promise to wrap the callback API: it resolves when the response emits 'end' and rejects when the request object returned by https.get emits 'error'.
So that seems to work, but we aren't logged in so let's create a function that authenticates with the correct endpoint and the correct credentials.
I'm going to start using environment variables to store the username and password, so I need to install dotenv for this. I'm only going to use it locally, so I'll install it as a dev dependency.
npm i -D dotenv
Now I'll create a .env file in the root of the project and add the following:
VITE_USERNAME=steve
VITE_PASSWORD=<secret>
I'm using vitest, so the variables have to be prefixed with VITE_ to allow us to use them in the tests.
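One thing to watch with credentials from the environment: a variable that isn't set reads as undefined, and interpolating undefined into a template string produces the literal text "undefined", so a missing password would be sent to the server rather than caught. A minimal sketch of a guard, assuming a hypothetical helper called `requireEnv` that takes the environment as a plain object (in the lambda that would be `process.env`):

```typescript
// Hypothetical helper, not part of the project: read a required variable
// from an env-like object and fail loudly when it is missing or empty.
type Env = Record<string, string | undefined>

const requireEnv = (env: Env, name: string): string => {
  const value = env[name]
  if (value === undefined || value === '') {
    throw new Error(`Missing environment variable: ${name}`)
  }
  return value
}
```

It would be called as `requireEnv(process.env, 'USERNAME')`, with whichever variable name the code actually reads.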
Here is the code to implement the authentication:
import fs from 'fs'
import https from 'https'

const getOptions = {
  hostname: 'www.freecycle.org',
  port: 443,
  path: '/home/dashboard',
  method: 'GET',
  headers: {
    'Content-Type': 'text/html',
  },
}

const postOptions = {
  hostname: 'www.freecycle.org',
  port: 443,
  path: '/login',
  method: 'POST',
  headers: {
    'Content-Type': 'application/x-www-form-urlencoded',
  },
}

// Fetch the dashboard, sending the session cookie along with the request
const getLatestPosts = async (cookie: string) =>
  new Promise((resolve, reject) => {
    const req = https.get(
      { ...getOptions, headers: { ...getOptions.headers, Cookie: cookie } },
      (res) => {
        let data = ''
        res.on('data', (d) => {
          data += d
        })
        res.on('end', () => {
          resolve(data)
        })
      }
    )
    req.on('error', (error) => {
      console.error(error)
      reject({ status: 500, body: 'error' })
    })
  })

// Post the credentials to /login and resolve with the first Set-Cookie header
const getLoginCookie = async (): Promise<string> =>
  new Promise((resolve, reject) => {
    const req = https.request(postOptions, (res) => {
      // The body isn't needed, but it has to be consumed for 'end' to fire
      res.on('data', (d) => d)
      res.on('end', () => {
        resolve(res?.headers?.['set-cookie']?.[0] as string)
      })
    })
    req.on('error', (error) => {
      console.error(error)
      reject({ status: 500, body: 'error' })
    })
    // Send the credentials as a URL-encoded form body
    const formData = new URLSearchParams()
    formData.append('user', `${process.env['USERNAME']}`)
    formData.append('password', `${process.env['PASSWORD']}`)
    req.write(formData.toString())
    req.end()
  })

const handler = async () => {
  const cookie = await getLoginCookie()
  // Keep only the leading name=value pair and drop attributes such as Path
  const latestPosts = await getLatestPosts(cookie.split(';')[0] as string)
  fs.writeFileSync('dist/posts.html', latestPosts as string) // write it out to a file so we can see it
  return { statusCode: 200, body: 'hello world' }
}

export { handler }
So let's make the dist folder:
mkdir dist
If we run the test we should see some results in the dist/posts.html file.
npm test
So now we can set this cookie and make a request to the dashboard page. We should get a list of posts back.
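The handler above keeps only the first Set-Cookie entry and takes everything before the first semicolon. If a login response ever carried more than one cookie (an assumption, I haven't checked whether Freecycle does this), the same idea extends to the whole header array. A rough sketch, with `toCookieHeader` being a hypothetical helper name:

```typescript
// Turn Set-Cookie header values into a single Cookie request header.
// Each Set-Cookie value looks like "name=value; Path=/; HttpOnly";
// only the leading "name=value" pair is sent back to the server.
const toCookieHeader = (setCookie: string[] | undefined): string =>
  (setCookie ?? [])
    .map((entry) => entry.split(';')[0]?.trim() ?? '')
    .filter((pair) => pair.length > 0)
    .join('; ')
```

An empty string coming back means the login response set no cookie at all, which is worth checking for before requesting the dashboard.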
Continue to part 3 - how I parsed the data using unifiedjs