To OSINT and Beyond!
Open-Source Intelligence (OSINT) can be valuable for an organization and penetration testing engagements in several ways. Today, let me highlight two areas: Leaked Credentials and Files.
As part of any security engagement, it is ideal, if not essential, that we look up our target’s leaked credentials and files, as many clients do not have a high level of visibility or awareness in this area. However, if we can provide an answer to those questions, then it could be an eye-opener and hopefully help cultivate better security hygiene within their organization.
Our all-seeing pentest eye perspective not only shows us a glimpse of a client’s security posture but can also indicate additional avenues or attack vectors to explore. This information can potentially let us gain an initial foothold or discover information we can use against our targets.
During our penetration testing engagements, there are moments when we stumble upon leaked files or credentials that can be leveraged. Discovering something relevant and useful is exhilarating, especially when we’re working from a ‘black box’ or ‘gray box’ testing approach. However, it's essential to remember that we shouldn't rush to judgment, even if we find leaked credentials or files. We don't subscribe to the notion that "once a leaker, always a leaker." Instead, our assessment considers whether an organization has learned from past mistakes. Either way, we're here to help. After all, there are numerous reasons for data leakage.
It is amazing to witness the evolution of search engines and file sharing. Then and now, we often hear the phrase “Google is your friend,” and I agree, most of the time. In the early days of the Internet, before the dominance of Google and Facebook, there were a few other go-to tools for finding information: Friendster, MySpace, MSN, AltaVista, Yahoo, Excite, and Lycos, to name a few.
Even before the rise of these search engines, there was another realm of online exploration that captured the imagination of many: file and data sharing. Some that come to mind are Gnutella, Napster, LimeWire/FrostWire, BearShare, and, of course, torrent apps, all of which are peer-to-peer (P2P) file-sharing applications. Aside from those, one of my favorites is Internet Relay Chat (IRC). Its primary use is real-time text-based chat, but some IRC networks and channels allow file sharing, enabling users to exchange files directly with others. IRC bots were popular in those days for automating all sorts of tasks.
A simple example of OSINT-ing is performing a web search. In security lingo, we call this “Google Dorking” or “Google Hacking”. Many credit Johnny Long, a professional hacker and security researcher, with popularizing the term. He authored the book "Google Hacking for Penetration Testers" and developed the Google Hacking Database (GHDB), a collection of Google dorks and search queries for finding vulnerable websites and information. The GHDB has played a significant role in spreading awareness about Google Dorking by providing a repository of search queries that can be used to discover vulnerabilities, exposed databases, and sensitive information.
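To make the idea concrete, here is a minimal sketch of how dork queries for a target domain might be assembled. The `site:`, `filetype:`, `inurl:`, and `intitle:` operators are standard Google search operators, but the templates, the function names, and the `example.com` domain are assumptions chosen for illustration; they are not the queries used in our engagements. The sketch only builds strings and makes no network requests.

```python
from typing import List
from urllib.parse import quote_plus

# Illustrative templates only; real engagements tune these per target.
DORK_TEMPLATES = [
    "site:{domain} filetype:pdf",
    "site:{domain} inurl:token",
    'site:{domain} intitle:"index of"',
]


def build_dorks(domain: str) -> List[str]:
    """Fill each template with the target domain."""
    return [template.format(domain=domain) for template in DORK_TEMPLATES]


def to_search_url(query: str) -> str:
    """URL-encode a dork so it can be pasted into a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


for dork in build_dorks("example.com"):
    print(dork, "->", to_search_url(dork))
```

Keeping the templates in a plain list makes it easy to swap in entries from the GHDB that are relevant to a given engagement.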
We were very lucky in this example. During a recursive search for the target domain(s), results showed that Google had crawled the majority of the targets, including static authentication tokens and other static parameters that would allow one-click access to the application.
Figure 1. Command-line output of Google CLI query which led to discovery of authentication tokens and other static user parameters
As we can see, the tokens are usable. This allows us to access the authenticated sections of one of our target applications which serves as our initial foothold for this engagement.
Figure 2. Authenticated interface of the application after using tokens harvested from Google
Unfortunately, we reported this leaked credentials issue in 2021, and, sadly, it still exists.
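When search results contain many crawled URLs, a small filter can help spot the ones carrying token-like parameters. The following is a rough sketch, assuming the URLs have already been collected; the parameter names in `SUSPICIOUS_PARAMS` and the sample URLs are hypothetical and would need adjusting for a real target.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical parameter names that often carry session or auth material.
SUSPICIOUS_PARAMS = {"token", "auth", "session", "sessionid", "access_token", "api_key"}


def find_token_params(url):
    """Return (name, value) pairs from the query string whose name looks sensitive."""
    query = urlsplit(url).query
    return [(name, value) for name, value in parse_qsl(query)
            if name.lower() in SUSPICIOUS_PARAMS]


crawled_urls = [
    "https://app.example.com/report?id=42&token=abc123",
    "https://app.example.com/help?page=faq",
]
for url in crawled_urls:
    hits = find_token_params(url)
    if hits:
        print(url, hits)
```

A name-based filter like this will miss tokens hidden behind unusual parameter names, so it is a triage aid rather than a replacement for manual review.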
Using these leaked credentials, we could identify a valid Microsoft (MS) account and then gain remote access or perform additional reconnaissance, such as user and domain enumeration.
Figure 3. Example of credentials checking via Microsoft login
Below is an example of a search graph. Most of the time, large chunks of leaked credentials, along with their corresponding passwords, circulate with ‘unknown’ source(s).
Figure 4. Example of results: Statistics of leaked credentials by source
Figure 5. Overview of leaked credentials showing the majority of leaked credentials have passwords included
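Statistics like those in Figures 4 and 5 can be approximated with a simple tally. This is a minimal sketch that assumes each leaked record is a dictionary with `email`, `password`, and `source` keys; that record layout and the sample data are assumptions, not the export format of any particular leak-checking service.

```python
from collections import Counter


def summarize_leaks(records):
    """Count leaked records per source and how many include a password."""
    # Records with a missing or empty source are grouped under "unknown".
    by_source = Counter(record.get("source") or "unknown" for record in records)
    with_password = sum(1 for record in records if record.get("password"))
    return by_source, with_password


records = [
    {"email": "a@example.com", "password": "hunter2", "source": "breach-a"},
    {"email": "b@example.com", "password": "", "source": None},
    {"email": "c@example.com", "password": "letmein", "source": None},
]
by_source, with_password = summarize_leaks(records)
print(by_source.most_common())
print(f"{with_password} of {len(records)} records include a password")
```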
Here is another example. Upon initial reconnaissance, we found a few leaked credentials on the Dark Web. Fortunately, one was usable, giving us access to the authenticated surfaces of the target. This example underscores the value of Two-Factor Authentication (2FA) or Multi-Factor Authentication (MFA), which adds an extra layer of security to accounts by requiring users to provide two or more forms of verification in addition to their usernames and passwords.
Figure 6. Two leaked credentials discovered via OSINT
Figure 7. Authenticated section of the target application after the successful login using the leaked credentials
As we can see, the password was last updated nearly a decade ago. This reveals that old, leaked credentials can still be exploited. It highlights the fact that even the presence of a single compromised credential can lead to a data breach, potentially exposing highly sensitive data. This further highlights the importance of having a proper user account policy, encompassing various security considerations, including the enforcement of strong password requirements, regular password expiration policies, and the implementation of additional security layers such as MFA and 2FA.
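As a rough illustration of such an account policy, the sketch below flags accounts whose password is older than a threshold or that lack MFA. The function name, the 365-day default, and the sample dates are assumptions for illustration; an organization's real policy and identity provider will dictate the actual values and data source.

```python
from datetime import date


def audit_account(last_password_change, mfa_enabled, today, max_age_days=365):
    """Return a list of policy findings for one account."""
    findings = []
    age_days = (today - last_password_change).days
    if age_days > max_age_days:
        findings.append(f"password unchanged for {age_days} days")
    if not mfa_enabled:
        findings.append("MFA not enabled")
    return findings


# Hypothetical account whose password was last changed about nine years ago.
print(audit_account(date(2014, 1, 1), False, today=date(2023, 1, 1)))
```

Passing `today` explicitly keeps the check deterministic, which makes it easier to test and to rerun against historical snapshots.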
There are trade-offs between transparency and security. Even username and email formats can make a substantial difference in account security. Consider mbarao@<domain>.com vs. midelbarao@<domain>.com: the former conceals the user’s first name, while the latter is more useful for targeted and mass phishing attacks.
Figure 8. Example of accessed user data including payment history and details
Leaked Files and Assets
Figure 9 shows a massive collection of leaked files from cloud storage.
Figure 9. Example list of leaked files
Due to bad asset management practices or policy and configuration errors, countless files containing sensitive data are leaked from cloud storage services such as Azure, AWS S3, DigitalOcean, and Google.
In one engagement we conducted, the client mentioned that, due to its large size, there could be "unknown" assets and asked us to find and identify them.
Here is an example of the reconnaissance conducted.
Figure 10. Command-line output of a query enumerating files/storages related to a target organization
Figure 11. Viewing the number of leaked files from various insecure cloud storage services
Subsequent reconnaissance showed a handful of related “unknown” cloud storage instances, including one that was publicly readable. In this example, we achieved our goal of discovering assets and files of which the client was previously unaware.
Figure 12. Discovered cloud storage with publicly accessible files
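One common way to start this kind of enumeration is to generate candidate storage names from the organization name and check them later with tooling of choice. The sketch below only builds candidate URLs and sends no requests. The endpoint patterns follow the public URL formats for AWS S3, Azure Blob Storage, and Google Cloud Storage; the suffix wordlist and the `acme` organization name are hypothetical, and this is not the tool shown in the figures.

```python
# Public endpoint patterns for common object-storage services.
ENDPOINTS = {
    "aws_s3": "https://{name}.s3.amazonaws.com/",
    "azure_blob": "https://{name}.blob.core.windows.net/",
    "gcs": "https://storage.googleapis.com/{name}/",
}

# Hypothetical suffix wordlist; real wordlists are much larger.
SUFFIXES = ["", "-backup", "-dev", "-assets"]


def candidate_urls(org):
    """Map each service to the candidate URLs derived from the org name."""
    urls = {}
    for service, pattern in ENDPOINTS.items():
        names = [org + suffix for suffix in SUFFIXES]
        if service == "azure_blob":
            # Azure storage account names allow only lowercase letters and digits.
            names = [name.replace("-", "") for name in names]
        urls[service] = [pattern.format(name=name) for name in names]
    return urls


for service, urls in candidate_urls("acme").items():
    print(service, urls)
```

Any checks against the generated URLs should stay within the scope agreed with the client, since name collisions with unrelated organizations are common.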
Further analysis of the downloaded files showed Application Programming Interface (API) credentials for a domain owned by the target organization.
Figure 13. Source code review of the downloaded files revealed API credentials
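Reviewing downloaded files for credentials can be partly automated with pattern matching. Here is a minimal sketch with two illustrative patterns: the well-known `AKIA` prefix format of AWS access key IDs, and a generic quoted `api_key`/`secret`/`token` assignment. The pattern set, the function name, and the sample snippet are assumptions; dedicated secret scanners ship far more rules and also validate what they find.

```python
import re

PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Generic quoted assignment such as apiKey: "..." or secret = '...'.
    "generic_api_key": re.compile(
        r"\b(api[_-]?key|secret|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_text(text):
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings


# Hypothetical snippet standing in for a downloaded source file.
sample = 'const cfg = {\n  apiKey: "not-a-real-key-123",\n  region: "us-east-1"\n};'
print(scan_text(sample))
```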
Sometimes, post-reconnaissance work requires additional automated enumeration. In this instance, we had to modify an existing Burp extension and add active checks for our purpose.
Some examples of additional checks are:
- Checking for deleted cloud storage that is still referenced by the target(s). This is useful for automating takeover attacks.
- Reading files from any identified public buckets
- Automating forced uploads of test files
Figure 14. Example output of detecting insecure cloud storage
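The logic behind the first two checks can be sketched as a classifier over the HTTP response returned by a storage endpoint. This is not the modified Burp extension; it is a standalone illustration that assumes S3-style responses, where a deleted bucket returns a `NoSuchBucket` error, a publicly listable bucket returns a `ListBucketResult` document, and a private bucket returns `AccessDenied`. The function name and category labels are assumptions, and the forced-upload check is deliberately left out.

```python
def classify_bucket_response(status, body):
    """Classify an S3-style HTTP response for a bucket referenced by a target."""
    if status == 404 and "NoSuchBucket" in body:
        # Referenced but deleted: a possible takeover candidate.
        return "missing"
    if status == 200 and "<ListBucketResult" in body:
        # Anonymous listing works, so files can likely be read.
        return "public-listing"
    if status == 403 and "AccessDenied" in body:
        return "private"
    return "unknown"


# Hypothetical responses standing in for captured traffic.
print(classify_bucket_response(404, "<Error><Code>NoSuchBucket</Code></Error>"))
print(classify_bucket_response(200, "<ListBucketResult><Name>demo</Name></ListBucketResult>"))
```

Separating classification from the HTTP request itself makes the logic reusable, whether the responses come from a proxy extension, a crawler, or saved traffic.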
Dumped credentials and files could end up on the Dark Web for sale. Some credentials and files are leaked publicly depending on the malicious actor’s motivation.
We have access to vast resources that we can use to check for leaked credentials. Back in the day, we used weleakinfo until it was seized. Currently, leakcheck.io and dehashed are two popular tools for checking leaked credentials. Shodan and Censys are also valuable tools that help cybersecurity professionals and researchers discover vulnerable devices and systems.
Addressing credential and data leaks requires a comprehensive approach that includes proper policies, training, monitoring, and regular assessments. Ultimately, every organization must prioritize data security and embed vigilance in the development processes and procedures that touch its assets to mitigate the associated risks.
The importance of OSINT in our penetration testing efforts cannot be overstated. Leaked credentials and files are often a treasure trove of information that could reveal an organization's vulnerabilities. We analyze all gathered data to identify additional attack surfaces that an actual malicious actor could exploit, providing our clients with invaluable insights to further fortify their defenses.
In conclusion, the immense power of OSINT in security and penetration testing lies not only in the information it uncovers, but also in its capacity to cultivate better security hygiene within any organization's processes, elevate overall awareness, and foster a safer and more secure digital landscape for all.
I’ll end with this phrase: “Artificial Intelligence (AI) is your friend.” Is it? It is no longer a remote idea that AI could change the way we live, cyberspace included.
Food for Thought
The integration of AI into OSINT promises to shape the future of intelligence gathering and analysis. There are interesting papers available elaborating on the growing importance and concerns of using AI in Open-Source Intelligence (OSINT) in contemporary intelligence practices. These research papers stress the need for a framework to validate AI-powered OSINT to address concerns like transparency, privacy, misinformation, and to address responsible and ethical use of AI in intelligence. AI technology for OSINT in the security paradigm is inevitably a path we are now traversing, involving data collection, analysis, and cyber security.
That’s about it! I hope you find this information useful, and as always, thank you for reading 😊.
Happy Hunting!
ABOUT TRUSTWAVE
Trustwave is a globally recognized cybersecurity leader that reduces cyber risk and fortifies organizations against disruptive and damaging cyber threats. Our comprehensive offensive and defensive cybersecurity portfolio detects what others cannot, responds with greater speed and effectiveness, optimizes client investment, and improves security resilience.