{"id":377,"date":"2015-04-06T02:10:33","date_gmt":"2015-04-06T02:10:33","guid":{"rendered":"http:\/\/sushain.com\/blog\/?page_id=377"},"modified":"2024-01-30T18:44:32","modified_gmt":"2024-01-31T00:44:32","slug":"publications","status":"publish","type":"page","link":"https:\/\/sushain.com\/blog\/publications\/","title":{"rendered":"Publications"},"content":{"rendered":"<blockquote>\n<p style=\"text-align: left;\">The following is a representative list of publications, book chapters and articles that I&#8217;ve co-authored in the data and analytics space. For a comprehensive listing with citations, see <span style=\"text-decoration: underline;\"><a href=\"https:\/\/scholar.google.com\/citations?user=kFftOGYAAAAJ&amp;hl=en\" target=\"_blank\" rel=\"noopener\">Google Scholar<\/a><\/span>.<\/p>\n<p style=\"text-align: left;\">I also have <strong>100+<\/strong> pending\/granted patents, some of which are listed <span style=\"text-decoration: underline;\"><a href=\"http:\/\/sushain.com\/blog\/patents\/\" target=\"_blank\" rel=\"noopener\">here<\/a><\/span>.<\/p>\n<\/blockquote>\n<p>1. 
<strong><span style=\"text-decoration: underline;\"><a href=\"https:\/\/www.igi-global.com\/article\/principled-reference-data-management-for-big-data-and-business-intelligence\/165392\" target=\"_blank\" rel=\"noopener\"><span class=\"patent-title\">Principled Reference Data Management for Big Data and Business Intelligence<\/span><\/a><\/span><\/strong><\/p>\n<p><span class=\"patent-title\"><span class=\"patent-title\"><span class=\"patent-title\"><a href=\"https:\/\/www.igi-global.com\/article\/principled-reference-data-management-for-big-data-and-business-intelligence\/165392\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" loading=\"lazy\" class=\"alignright wp-image-1186\" src=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2024\/01\/ijoci.webp\" alt=\"\" width=\"200\" height=\"285\" \/><\/a><span id=\"ctl00_cphFeatured_lblAbstract\"><\/span><\/span><\/span><\/span><span class=\"patent-title\"><span class=\"patent-title\"><span class=\"patent-title\"><span id=\"ctl00_cphFeatured_lblAbstract\">Most large enterprises requiring operational business processes utilize several thousand instances of legacy, upgraded, cloud-based, and\/or acquired information management applications. With the advent of Big Data, Business Intelligence (BI) systems receive unconsolidated data from a wide range of data sources with no overarching governance procedures to ensure quality and consistency. Although different applications deal with their own flavor of data, reference data is found in all of them. 
<\/span><\/span><\/span><\/span><\/p>\n<p><span class=\"patent-title\"><span class=\"patent-title\"><span class=\"patent-title\"><span id=\"ctl00_cphFeatured_lblAbstract\">Given the critical role that BI plays in ensuring business success, the fact that BI relies heavily on the quality of data to ensure that the intelligence being provided is trustworthy, and the prevalence<\/span><\/span><\/span><\/span><span class=\"patent-title\"><span class=\"patent-title\"><span class=\"patent-title\"><span id=\"ctl00_cphFeatured_lblAbstract\"> of reference data in the information integration landscape, a principled approach towards management, stewardship and governance of reference data becomes necessary to ensure quality and operational excellence across BI systems. <\/span><\/span><\/span><\/span><\/p>\n<p><span class=\"patent-title\"><span class=\"patent-title\"><span class=\"patent-title\"><span id=\"ctl00_cphFeatured_lblAbstract\">The authors discuss this approach in context of typical reference data management concepts and features, leading to a comprehensive solution architecture for BI integration.<\/span><\/span><\/span><\/span><\/p>\n<hr \/>\n<p>2.\u00a0 <strong><a href=\"https:\/\/www.slideshare.net\/slideshow\/embed_code\/key\/49LGHXPWpYxdUl\" target=\"_blank\" rel=\"noopener\"><span style=\"text-decoration: underline;\">Technical Strategies for Information Virtualization<br \/>\n<\/span><\/a><\/strong><a href=\"https:\/\/docs.google.com\/viewer?url=patentimages.storage.googleapis.com\/pdfs\/US20140164399.pdf\"><span style=\"text-decoration: underline;\"><br \/>\n<\/span><\/a>Over the years, the information landscape within many organizations has become increasingly complex. 
More information is used by <a href=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-05-at-11.46.07-PM.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignleft size-full wp-image-384\" src=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-05-at-11.46.07-PM.png\" alt=\"Screen Shot 2015-04-05 at 11.46.07 PM\" width=\"573\" height=\"327\" srcset=\"https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-05-at-11.46.07-PM.png 573w, https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-05-at-11.46.07-PM-300x171.png 300w\" sizes=\"(max-width: 573px) 100vw, 573px\" \/><\/a>more applications that are reaching more people. Mergers, acquisitions and global expansion meld disparate systems within enterprises and expand them regionally. Throughout the world, new business models have been built around easy access to and sharing of information.<\/p>\n<p>Greater data accessibility provides new opportunities, but much of the data within organizations is accessible only to those who know where the information is, how to get it and how to use it. 
The growing value of underutilized information has led to a range of architectural strategies, technologies and patterns to harness their potential.<\/p>\n<p><a href=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-05-at-11.44.06-PM.png\"><img decoding=\"async\" loading=\"lazy\" class=\" wp-image-383 alignright\" src=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-05-at-11.44.06-PM.png\" alt=\"Screen Shot 2015-04-05 at 11.44.06 PM\" width=\"567\" height=\"275\" srcset=\"https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-05-at-11.44.06-PM.png 679w, https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-05-at-11.44.06-PM-300x146.png 300w\" sizes=\"(max-width: 567px) 100vw, 567px\" \/><\/a>Collectively, this set of capabilities can be called information virtualization (IV). IBM uses this term instead of data virtualization because IV encompasses not only the traditional structured data, but all varieties of data\u2014an important distinction since data with less structure has become more commonplace with the rise of big data.<\/p>\n<p>This paper discusses key IV technologies and offers several ways to form technical strategies for specific situations. 
It focuses on a pragmatic approach to evaluating business constraints, requirements and the technical environment in developing an architecture to meet current and planned information needs.<\/p>\n<hr \/>\n<p>3.\u00a0 <strong><a href=\"http:\/\/www.cs.iastate.edu\/~honavar\/Papers\/pandit-ictai10.pdf\" target=\"_blank\" rel=\"noopener\"><span style=\"text-decoration: underline;\">Ontology-guided Extraction of Complex Nested Relationships<em><br \/>\n<\/em><\/span><\/a><\/strong><a href=\"https:\/\/docs.google.com\/viewer?url=patentimages.storage.googleapis.com\/pdfs\/US20140164399.pdf\"><span style=\"text-decoration: underline;\"><br \/>\n<\/span><\/a>Many applications call for methods to enable automatic extraction of structured information from unstructured natural language text. <a href=\"http:\/\/dx.doi.org\/10.1109\/ICTAI.2010.98\"><img decoding=\"async\" loading=\"lazy\" class=\" wp-image-390 alignleft\" src=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.27.11-AM.png\" alt=\"Screen Shot 2015-04-06 at 12.27.11 AM\" width=\"206\" height=\"66\" \/><\/a>Due to inherent challenges of natural language processing, most of the existing methods for information extraction from text tend to be domain specific. We explore a modular ontology-based approach to information extraction that decouples domain-specific knowledge from the rules used for information extraction. We describe a framework for extraction of a subset of complex nested relationships (e.g., Joe reports that Jim is a reliable employee). 
The extracted relationships are output in the form of sets of RDF (resource description framework) triples, which can be queried using query languages for RDF and mined for knowledge acquisition.<\/p>\n<hr \/>\n<p>4.\u00a0 <strong><a href=\"http:\/\/aisel.aisnet.org\/cgi\/viewcontent.cgi?article=1037&amp;context=amcis2012\" target=\"_blank\" rel=\"noopener\"><span style=\"text-decoration: underline;\">Metadata Exploitation in Large-scale Data Migration Projects<\/span><\/a><a href=\"http:\/\/aisel.aisnet.org\/amcis2012\/proceedings\/DataInfoQuality\/6\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" loading=\"lazy\" class=\" wp-image-393 alignright\" src=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.39.57-AM.png\" alt=\"Screen Shot 2015-04-06 at 12.39.57 AM\" width=\"254\" height=\"75\" srcset=\"https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.39.57-AM.png 380w, https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.39.57-AM-300x88.png 300w\" sizes=\"(max-width: 254px) 100vw, 254px\" \/><\/a><a href=\"http:\/\/aisel.aisnet.org\/amcis2012\/proceedings\/DataInfoQuality\/6\" target=\"_blank\" rel=\"noopener\"><br \/>\n<\/a><\/strong><a href=\"https:\/\/docs.google.com\/viewer?url=patentimages.storage.googleapis.com\/pdfs\/US20140164399.pdf\"><br \/>\n<\/a>The inherent complexity of large-scale information integration efforts has led to the proliferation of numerous metadata capabilities to<a href=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.38.13-AM.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignleft wp-image-391\" src=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.38.13-AM.png\" alt=\"Screen Shot 2015-04-06 at 12.38.13 AM\" width=\"648\" height=\"131\" 
srcset=\"https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.38.13-AM.png 856w, https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.38.13-AM-300x61.png 300w\" sizes=\"(max-width: 648px) 100vw, 648px\" \/><\/a>\u00a0 improve upon project management, quality control and governance.<\/p>\n<p>In this paper, we utilise complex information integration projects in the context of SAP application consolidation to analyse several new metadata capabilities, which enable improved governance and control of data quality.<\/p>\n<p>Further, by investigating certain unaddressed aspects around these capabilities, often tending to negatively impact information integration projects, we identify key focus areas for shaping future industrial and academic research efforts.<\/p>\n<hr \/>\n<p>5.\u00a0 <strong><a href=\"http:\/\/www.redbooks.ibm.com\/abstracts\/sg248084.html?Open\" target=\"_blank\" rel=\"noopener\"><span style=\"text-decoration: underline;\">Practical Guide to Managing Reference Data with IBM InfoSphere Reference Data Management Hub<\/span><\/a><a href=\"http:\/\/aisel.aisnet.org\/amcis2012\/proceedings\/DataInfoQuality\/6\" target=\"_blank\" rel=\"noopener\"><br \/>\n<\/a><\/strong><a href=\"https:\/\/itunes.apple.com\/us\/book\/practical-guide-to-managing\/id646328664?mt=11\"><img decoding=\"async\" loading=\"lazy\" class=\" wp-image-394 alignleft\" src=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.47.19-AM.png\" alt=\"Screen Shot 2015-04-06 at 12.47.19 AM\" width=\"173\" height=\"295\" srcset=\"https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.47.19-AM.png 194w, https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.47.19-AM-176x300.png 176w\" sizes=\"(max-width: 173px) 100vw, 173px\" \/><\/a><\/p>\n<p>IBM InfoSphere Master Data Management Reference Data Management Hub (InfoSphere 
MDM Ref DM Hub) is designed as a ready-to-run application that provides the governance, process, security, and audit control for managing reference data as an enterprise standard, resulting in fewer errors, reduced business risk and cost savings.<\/p>\n<p>This IBM Redbooks publication describes where InfoSphere MDM Ref DM Hub fits into information management reference architecture. It explains the end-to-end process of an InfoSphere MDM Ref DM Hub implementation including the considerations of planning a reference data management project, requirements gathering and analysis, model design in detail, and integration considerations and scenarios. It then shows implementation examples and the ongoing administration tasks.<\/p>\n<p>This publication can help IT professionals who are interested or have a need to manage reference data efficiently and implement an InfoSphere MDM Ref DM Hub solution with ease.<\/p>\n<hr \/>\n<p>6.\u00a0 <strong><span style=\"text-decoration: underline;\"><a href=\"http:\/\/aisel.aisnet.org\/cgi\/viewcontent.cgi?article=1038&amp;context=amcis2012\" target=\"_blank\" rel=\"noopener\">Ontology-guided Reference Data Alignment in Information Integration Projects<\/a><\/span><\/strong><\/p>\n<p>One of the hard problems in information integration projects (harmonizing data from various legacy sources into one or more targets) <em><a href=\"http:\/\/aisel.aisnet.org\/amcis2012\/proceedings\/DataInfoQuality\/7\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" loading=\"lazy\" class=\" wp-image-393 alignright\" src=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.39.57-AM.png\" alt=\"Screen Shot 2015-04-06 at 12.39.57 AM\" width=\"268\" height=\"79\" srcset=\"https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.39.57-AM.png 380w, https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-12.39.57-AM-300x88.png 300w\" sizes=\"(max-width: 
268px) 100vw, 268px\" \/><\/a><\/em>is the appropriate alignment of reference data values across systems.<\/p>\n<p>Without this <a href=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-1.01.59-AM.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignleft wp-image-397\" src=\"http:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-1.01.59-AM.png\" alt=\"Screen Shot 2015-04-06 at 1.01.59 AM\" width=\"455\" height=\"266\" srcset=\"https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-1.01.59-AM.png 728w, https:\/\/sushain.com\/blog\/wp-content\/uploads\/2015\/04\/Screen-Shot-2015-04-06-at-1.01.59-AM-300x176.png 300w\" sizes=\"(max-width: 455px) 100vw, 455px\" \/><\/a>alignment, the process of loading records into the target systems might fail because the target might reject any record with an unknown reference data value or different underlying data semantics.<\/p>\n<p>Today, detecting reference data tables and determining the relative alignment between a source and a target is largely manual, cumbersome, error-prone and costly.<\/p>\n<p>We propose a semantic approach to detect reference data tables and their relative alignment across source\/target systems to enable semi-automated creation of translation tables.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Following is a representative list of publications, book chapters and articles that I&#8217;ve co-authored in data and analytics space. For a comprehensive listing with citations, see Google Scholar\u00a0. I also have 100+ pending\/granted patents, some of which are listed here. 1. 
Principled Reference Data Management for Big Data and Business Intelligence Most large enterprises requiring <a class=\"read-more\" href=\"https:\/\/sushain.com\/blog\/publications\/\">[&hellip;]<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0},"_links":{"self":[{"href":"https:\/\/sushain.com\/blog\/wp-json\/wp\/v2\/pages\/377"}],"collection":[{"href":"https:\/\/sushain.com\/blog\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/sushain.com\/blog\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/sushain.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sushain.com\/blog\/wp-json\/wp\/v2\/comments?post=377"}],"version-history":[{"count":36,"href":"https:\/\/sushain.com\/blog\/wp-json\/wp\/v2\/pages\/377\/revisions"}],"predecessor-version":[{"id":1206,"href":"https:\/\/sushain.com\/blog\/wp-json\/wp\/v2\/pages\/377\/revisions\/1206"}],"wp:attachment":[{"href":"https:\/\/sushain.com\/blog\/wp-json\/wp\/v2\/media?parent=377"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}