pkgforge-security/subxtract

ℹ️ About

Wrapper around go-fasttld for extracting the TLD (top-level domain) and ROOT_DOMAIN from domains, URLs, IPv4 and IPv6 addresses, etc., using the accurate and always up-to-date Public Suffix List.
Tools like tomnomnom/unfurl rely on hardcoded regexes and thus fail to properly separate the TLD suffix from the actual domain.
go-fasttld instead uses the Public Suffix List, which is auto-updated regularly.
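
To illustrate why a suffix list handles multi-label suffixes such as example.com.np where a fixed "last label is the TLD" rule does not, here is a minimal sketch of longest-suffix matching. It is not how go-fasttld or subxtract is implemented: the `SUFFIXES` list and the `split_domain` helper are hypothetical, and the three entries are a toy stand-in for the real Public Suffix List, which is far larger and also has wildcard and exception rules that this sketch ignores.

```shell
# Toy stand-in for the Public Suffix List (assumption: only these three entries).
SUFFIXES="com net com.np"

# split_domain HOST: print "<root> <suffix>" using the longest matching suffix.
# Returns non-zero when no suffix in the toy list matches.
split_domain() {
  host=$1
  best=""
  for s in $SUFFIXES; do
    case "$host" in
      *".$s")
        # Keep the longest suffix that matches the end of the host.
        if [ "${#s}" -gt "${#best}" ]; then best=$s; fi
        ;;
    esac
  done
  [ -n "$best" ] || return 1
  rest=${host%".$best"}   # strip the suffix, e.g. "xyz.abc.example"
  root=${rest##*.}        # last remaining label, e.g. "example"
  printf '%s %s\n' "$root" "$best"
}

split_domain example.com.np        # prints: example com.np
split_domain xyz.abc.example.com   # prints: example com
```

A rule that always treats the last label as the TLD would report `np` as the suffix and `com` as the root for the first input, which is the failure mode described above.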

🖳 Installation

Install with soar:

soar add 'subxtract#github.com.pkgforge-security.subxtract'

🧰 Usage

subxtract --help
Public-Suffix based TLDs (Top-Level-Domains) & Root Domain Extractor

Usage:
  subxtract [flags]

Flags:
  -c, --concurrency int     Limit the number of concurrent goroutines (Higher CPU/RAM Usage) (default 1000)
  -d, --domains             Print the root domain and suffix combined
  -f, --file string         Input file containing URLs|Domains (one per line)
  -h, --help                help for subxtract
  -i, --ignore-subdomains   Ignore (Exclude) subdomains
  -j, --json                Output in JSON Format (Everything)
  -s, --private-suffix      Include Private Suffix (Example: blogspot.com)
  -p, --punycode            Convert Internationalized Domain Names (IDN) to Punycode (ASCII Characters)
  -r, --roots               Print only the root domain (without suffix)
  -t, --tlds                Print only the Top Level Domain (TLD)

Examples:

  • Input

cat domains.txt

  • URLs are accepted as well as bare domains
  • Don't pass wildcard entries (.*)
abc.example.com
abc.example.net
apple.com
be.banana.com
example.com
example.com.np
example.net
example.org
xyz.abc.example.com
xyz.abc.example.net
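
Since wildcard entries should not be passed, one option is to filter them out before the list reaches subxtract. The sketch below assumes a wildcard entry is any line containing a literal `*`; the `drop_wildcards` helper is a hypothetical name, and the sample lines are illustrative only.

```shell
# drop_wildcards: remove every line that contains a literal asterisk.
# -F treats the pattern as a fixed string, -v inverts the match.
drop_wildcards() { grep -v -F '*'; }

printf '%s\n' 'abc.example.com' '*.example.net' 'apple.com' | drop_wildcards
# prints:
# abc.example.com
# apple.com
```

The filtered stream can then be piped into subxtract via STDIN in the same way as `cat domains.txt`.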

  • To extract root domain names (without the TLD)
subxtract -f domains.txt -r | awk '!seen[$0]++'
# Or via STDIN
cat domains.txt | subxtract -r | awk '!seen[$0]++'
  • Output

example
apple
banana
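
The `awk '!seen[$0]++'` filter used in the commands above drops duplicate lines while keeping the first occurrence of each in its original position, so no sorting is needed. A small self-contained sketch of that behavior (the `dedupe` wrapper name is only for illustration):

```shell
# dedupe: print each distinct line once, in first-seen order.
# seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++ is
# true only for first occurrences, and awk prints lines whose pattern is true.
dedupe() { awk '!seen[$0]++'; }

printf '%s\n' example example apple banana example | dedupe
# prints:
# example
# apple
# banana
```

`sort -u` would also remove duplicates, but it reorders the lines alphabetically.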

  • Similarly, to extract root domain names together with their TLDs
subxtract -f domains.txt -d | awk '!seen[$0]++'
# Or via STDIN
cat domains.txt | subxtract -d | awk '!seen[$0]++'
  • Output

example.com
apple.com
banana.com
example.net
example.org
example.com.np

About

Extract TLD (Top-Level-Domains), Root Domains & More using public-suffix List [Maintainer=@Azathothas]
