SNG

Integration of Databases of Various Government Departments


We integrated the databases of various government departments to give Punjab Revenue Authority (PRA) tax officers:

  • A holistic view of each taxpayer's footprint across government databases, so that taxpayers' claims can be verified.
  • A means of identifying tax evaders.

Our solution attempts to increase PRA’s tax revenue by amalgamating data from various SNG-connected government departments and applying statistical, data visualization, and machine learning techniques to identify tax evaders.
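As a rough illustration of what cross-checking a taxpayer's footprint might look like, the sketch below groups records from several department extracts by taxpayer and flags cases where activity seen elsewhere exceeds what was declared. It is not the project's actual method: the department names (`pra_returns`, `excise`), the field names (`taxpayer_id`, `declared_sales`, `observed_sales`), and the 10% tolerance are all hypothetical, and a simple threshold rule stands in for the statistical and machine learning techniques mentioned above.

```python
from collections import defaultdict


def build_footprint(sources):
    """Group records from several department extracts by taxpayer ID.

    `sources` maps a (hypothetical) department name to a list of record dicts,
    each carrying a `taxpayer_id` field.
    """
    footprint = defaultdict(dict)
    for department, records in sources.items():
        for record in records:
            footprint[record["taxpayer_id"]][department] = record
    return dict(footprint)


def flag_for_review(footprint, tolerance=0.10):
    """Return taxpayer IDs whose observed activity exceeds declared sales.

    Assumption: `pra_returns` holds the taxpayer's own declaration and every
    other department may report an `observed_sales` figure. A taxpayer with no
    return on file is treated as having declared zero.
    """
    flagged = []
    for taxpayer_id, by_department in sorted(footprint.items()):
        declared = by_department.get("pra_returns", {}).get("declared_sales", 0)
        observed = max(
            (
                record.get("observed_sales", 0)
                for department, record in by_department.items()
                if department != "pra_returns"
            ),
            default=0,
        )
        if observed > declared * (1 + tolerance):
            flagged.append(taxpayer_id)
    return flagged


# Hypothetical sample data, for illustration only.
sample_sources = {
    "pra_returns": [
        {"taxpayer_id": "T1", "declared_sales": 100},
        {"taxpayer_id": "T2", "declared_sales": 200},
    ],
    "excise": [
        {"taxpayer_id": "T1", "observed_sales": 150},
        {"taxpayer_id": "T2", "observed_sales": 205},
        {"taxpayer_id": "T3", "observed_sales": 50},
    ],
}
```

Calling `flag_for_review(build_footprint(sample_sources))` on the sample data lists the taxpayers an officer might want to look at more closely. A flag here is only a prompt for review, not evidence of evasion.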

Architecture

  • We set up Apache Hadoop in fully distributed mode on dedicated high-spec servers. It provides distributed storage through HDFS and distributed processing through YARN.
  • To speed up Big Data analysis, we use Apache Spark (PySpark, Spark SQL) and Apache Hive on top of the Hadoop cluster.
  • For large-scale data indexing (from Spark) and full-text search, we deployed Apache Solr in cloud mode on dedicated servers.
  • For new incremental data, we developed APIs that fetch the incremental data and put it on HDFS for later processing.
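The incremental ingestion step could be sketched along the following lines. This is a minimal illustration under stated assumptions, not the project's actual API: it assumes each record carries an `updated_at` timestamp that can serve as a watermark, and the directory layout (`/data/raw/<department>/ingest_date=...`) and all function names are hypothetical. The upload itself is shown only as the argument list for the standard `hdfs dfs -put` shell command, so the sketch runs without a cluster.

```python
import json
from datetime import datetime, timezone


def select_incremental(records, last_watermark):
    """Return records updated strictly after the watermark, plus the new watermark."""
    fresh = [r for r in records if r["updated_at"] > last_watermark]
    new_watermark = max((r["updated_at"] for r in fresh), default=last_watermark)
    return fresh, new_watermark


def hdfs_target_path(base_dir, department, batch_time):
    """Build a date-partitioned HDFS path for one batch (hypothetical layout)."""
    return (
        f"{base_dir}/{department}/"
        f"ingest_date={batch_time:%Y-%m-%d}/batch_{batch_time:%H%M%S}.jsonl"
    )


def to_json_lines(records):
    """Serialize records as JSON Lines; datetimes are written as strings."""
    return "\n".join(json.dumps(r, sort_keys=True, default=str) for r in records)


def hdfs_put_command(local_file, hdfs_path):
    """Argument list for `hdfs dfs -put`; `-f` overwrites an existing target."""
    return ["hdfs", "dfs", "-put", "-f", local_file, hdfs_path]


# Hypothetical records, as a department's API might return them.
sample_records = [
    {"id": 1, "updated_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2024, 1, 3, tzinfo=timezone.utc)},
]
```

In use, a scheduled job would call `select_incremental` with the watermark saved from the previous run, write the output of `to_json_lines` to a local file, pass the result of `hdfs_put_command` to `subprocess.run`, and store the new watermark only after the upload succeeds. Partitioning the path by ingest date lets later Spark or Hive jobs read just the newest batches.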