OSINT Team

We teach OSINT from multiple perspectives. InfoSec experts, journalists, law enforcement and other intelligence specialists read us to grow their skills faster.


4 min read · Jul 14, 2023

Open source tool for open source researchers: How to use TG Collector to scrape Telegram channels?

Photo by Amin Moshrefi on Unsplash

Although journalists and researchers who work with open sources are mostly tech-savvy, not everyone is familiar with command-line interface (CLI) tools or has the time to learn them. For a researcher who works on hundreds of cases and monitors online spaces to document wrongdoing, war crimes, hate speech or disinformation, a graphical user interface (GUI) tool is preferable, first and foremost because it saves time.

There are already many excellent open-source tools and scripts for researchers that focus on Telegram, and I want to add a new one to the list. Let me introduce TG Collector (TGC), a new open-source tool, and show how to use it.

TGC is a browser-based application for scraping (collecting) messages from Telegram channels. Its purpose is to ease the workload of researchers who work with Telegram channels. Because it is a tool, not a service, your personal data is not collected (except for anonymous usage statistics, described on the website). All the data you collect is stored on your computer, specifically in your browser. Let’s go through the process step by step.

First step — get your API keys

After opening the tool, you will see the collection section, where you can list your channels and start collecting messages from them. You can create and name a collection folder, but to start collecting you first need to get your Telegram API keys.


A login popup will appear and direct you to the MyTelegram page, where you can get your API keys.

Second step — add your channels

After logging in, you can start adding the channels you want to collect messages from. First create a collection, then insert the channel handles (not the channel names). A collection works like a folder: it helps you organize your channels, so you can keep them separated by topic or interest.

After you insert the channels, you will see general information about each one, such as its creation date, subscriber count, description, name and handle.


If you have dozens or hundreds of channels, you can insert them all at once by separating them with commas.
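If your channel list lives in a spreadsheet or in notes, it can help to clean it up before pasting it in. The snippet below is a minimal sketch of that idea and is not part of TG Collector: `parse_handles` is a hypothetical helper, and the assumption that handles may arrive with an `@` prefix or as `t.me` links is mine.

```python
import re

# Hypothetical helper, not part of TG Collector: it turns one pasted,
# comma-separated string into a clean list of channel handles.
_LINK_PREFIX = re.compile(r"^(?:https?://)?(?:t\.me|telegram\.me)/", re.IGNORECASE)


def parse_handles(raw: str) -> list[str]:
    """Split on commas, drop link prefixes and '@', and remove duplicates."""
    handles: list[str] = []
    seen: set[str] = set()
    for part in raw.split(","):
        handle = _LINK_PREFIX.sub("", part.strip()).lstrip("@").strip("/")
        key = handle.lower()  # handles are compared case-insensitively here
        if handle and key not in seen:
            seen.add(key)
            handles.append(handle)
    return handles


if __name__ == "__main__":
    pasted = "@example_channel, https://t.me/another_channel , example_channel,"
    # Join the cleaned handles back into one comma-separated line for pasting.
    print(", ".join(parse_handles(pasted)))
```

The result keeps the original order and skips empty entries, so a trailing comma or a repeated handle in the pasted text does no harm.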

Third step — collect

Once you have one or more collections, you can start collecting messages. Select the channels you want data from and give your project any name you like. Then choose which data fields to collect. For example, if you only need forwards, you can select only “fwdrom”, which gives you the URL of the post, where it was forwarded from and to, and when.


You can also select all data fields, which gives you a comprehensive overview of each message.
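Choosing fields amounts to keeping only some keys of every message record. The sketch below illustrates this on made-up data. The field names (`url`, `date`, `text`, `forwarded_from`) and the `select_fields` helper are assumptions for illustration and do not reflect TG Collector's actual schema or code.

```python
# Illustrative records only: these field names are assumed, not TGC's schema.
messages = [
    {
        "url": "https://t.me/example_channel/10",
        "date": "2023-07-01",
        "text": "hello",
        "forwarded_from": "other_channel",
    },
    {
        "url": "https://t.me/example_channel/11",
        "date": "2023-07-02",
        "text": "original post",
        "forwarded_from": None,
    },
]


def select_fields(records, fields):
    """Keep only the requested keys of each record (missing keys become None)."""
    return [{name: record.get(name) for name in fields} for record in records]


# Keep three fields, then keep only the messages that were forwarded.
selected = select_fields(messages, ["url", "date", "forwarded_from"])
forwards = [record for record in selected if record["forwarded_from"]]

if __name__ == "__main__":
    for record in forwards:
        print(record)
```

Selecting fewer fields keeps the export small and easier to read, while selecting everything leaves the filtering for later analysis.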

Fourth step — download data

Each collection has a second subsection called “collected messages”. It shows the scraping date, the status, the number of channels and the number of messages collected. You can download the data in two formats, JSON or CSV, depending on your needs.
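Once a file is downloaded, a few lines of Python are enough for a first look at it. This is a sketch under stated assumptions, not a description of the real export format: I assume the JSON file is a top-level array of message objects and that each object has a `channel` key. Adjust both to whatever your export actually contains.

```python
import json
from collections import Counter
from pathlib import Path


def load_messages(path):
    """Read a JSON export, assumed here to be a top-level list of messages."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of message objects")
    return data


def count_by_channel(messages, key="channel"):
    """Count messages per channel; 'channel' is an assumed field name."""
    return Counter(message.get(key, "unknown") for message in messages)


if __name__ == "__main__":
    import sys

    # Usage: python inspect_export.py path/to/export.json
    counts = count_by_channel(load_messages(sys.argv[1]))
    for channel, total in counts.most_common():
        print(channel, total)
```

JSON keeps nested values intact, which suits scripted analysis like this, while CSV opens directly in a spreadsheet.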


Feedback and contribution!

As the tool is open source, you are free to contribute to it or to build on it yourself!


Published in OSINT Team



Written by Sayyara Mammadova

She is a strong technology enthusiast and enjoys figuring out tech-based approaches for journalism. She works as a researcher at DFRLab.
