P O S T M O D E R N

Solving Advent of Cyber 2023 Day 2 using Ruby

adventofcyber, adventofcyber2023, ctf, ruby, tryhackme, writeup

This year I decided to try my hand at the Advent of Cyber challenges.

The Day 2 challenge involves Data Science. We are given a Jupyter Notebook file containing a table of log data showing ports scan events. Now the challenge teaches you how to use Jupyter Notebook and Python, but we’re not going to solve it using Python. We are going to solve it using only Ruby!

While Python is very popular in the Data Science field, you can do Data Science with Ruby. Ruby standard library comes with many useful methods, such as map, select, group_by, group_by, which allow you to slice and dice large datasets.

First, we will need to liberate the data from the Jupyter Notebook. To do this, we open the Jupyter Notebook, navigate to the Table View, select all rows, copy the rows, and paste into a text file.

PacketNumber	Timestamp	Source	Destination	Protocol
1	05:49.5	10.10.1.7	10.10.1.9	HTTP
2	05:50.3	10.10.1.10	10.10.1.3	TCP
3	06:10.3	10.10.1.1	10.10.1.2	HTTP
4	06:10.4	10.10.1.9	10.10.1.3	ICMP
...

The rows will paste as Tab Separated Values (TSV). We will need to convert the rows into Comma Separated Values (CSV). Converting from TSV to CSV is as simple as the following vim substitution command %s/\v\t/,/g.

PacketNumber,Timestamp,Source,Destination,Protocol
1,05:49.5,10.10.1.7,10.10.1.9,HTTP
2,05:50.3,10.10.1.10,10.10.1.3,TCP
3,06:10.3,10.10.1.1,10.10.1.2,HTTP
4,06:10.4,10.10.1.9,10.10.1.3,ICMP
...

Much better. Finally, we save the file to data.csv.

Next, we will spawn an Interactive Ruby session using irb with the csv library preloaded:

$ irb -r csv
irb(main):001> 

Now we will load our data.csv file into a variable:

csv = CSV.read('data.csv', headers: true)

Now we just have to answer the Day 2 questions using pure Ruby.

How many packets were captured (looking at the PacketNumber)?

csv[-1]['PacketNumber']

What IP address sent the most amount of traffic during the packet capture?

csv.group_by { |row| row['Source'] }.max_by { |ip,events| events.count }.first

What was the most frequent protocol?

csv.group_by { |row| row['Source'] }.max_by { |ip,events| events.count }.first

As you can see, you don’t necessarily have to use Python for Data Science. Ruby is more than capable of doing basic Data Science.