Standardisation Approach
A detailed insight into the open standards principles adopted to establish DIGIT platform specifications
Effective co-creation and partnerships in developing digital public goods are based on solid frameworks and principles that define the rules of co-creation. The DIGIT platform is designed using open-source standards that allow other stakeholders in the ecosystem to contribute and build innovative governance solutions.
The published DIGIT specifications and open standards principles help in identifying and explaining the module taxonomies, data modelling approach and API data interactions. This section provides a detailed insight into the DIGIT platform specifications and open standards approach in general.
Taxonomy Specifications
The past couple of decades have seen an increasing thrust from governments to build digital infrastructures that are scalable and follow open standards. Frameworks, institutional mechanisms, and standards are necessary to bring comprehensiveness to eGovernance initiatives. They can also help drive efficiencies that enable large-scale transformations. These frameworks and standards are also crucial for bringing us closer to technological sovereignty.
There is an established need for standardized knowledge systems that can aid in developing sustainable digital solutions. To build standards that can provide context and boundary to an otherwise fragmented knowledge base of ‘Urban Governance,’ Taxonomies and Ontologies have been identified as essential tools. They are integral to the standards that evolve.
Approach
Creating taxonomies is foundational in any knowledge system, and this document attempts to be a repository for standardized nomenclature and classifications in different e-governance application domains. Taxonomies help build robust frameworks and guidelines that support specific operations in context. This, in turn, aids in creating workable solutions that can be adopted across different Urban Local Body (ULB) jurisdictions in India.
The taxonomies defined as part of the DIGIT specifications establish key principles in the context of the various elements linked to a specific module. This knowledge is essential to bring in a uniform understanding of the domain and its diverse components.
Scope
The taxonomy is developed to meet the practical need to carve out a harmonized view of different elements of the module. It tries to capture the most important entities, their properties, categories, subcategories, parameters, and specifications within this domain as well as other associated areas.
Capturing abstract knowledge across a domain and rendering a uniform perspective of this knowledge is the core challenge in the development of knowledge representation tools like taxonomies and ontologies. Taking this into consideration, a conceptual framework has been followed, which helped in enforcing a systematic approach to building the taxonomies.
This framework follows a bottom-up approach and allows inductive inferencing while developing these knowledge models. The framework has seven broad components, and they are elucidated using an example of building a taxonomy for college education in India.
Domain: A domain outlines the scope of the knowledge area around which the taxonomies and ontologies are to be defined. In our example, the Indian Education System can be marked as the domain.
Sub Domain: A sub-domain denotes a subset of the knowledge area. It narrows down the scope of the knowledge area and helps in establishing a clear context. A domain may consist of more than one sub-domain, and these sub-domains can be overlapping or mutually exclusive. For instance, the domain of the Indian Education System can consist of several sub-domains such as Higher Education Systems, Primary Education, Elementary Education, and Integrated Child Welfare Schemes (Anganwadis). Narrowing the analysis of college education in India down to the sub-domain of Higher Education Systems brings more focus.
Entity: Entities in a domain define the nouns in that knowledge area. They are the essential building blocks in any knowledge model. While building taxonomies and ontologies, identifying the key entities and defining them is a critical step. In a college taxonomy, Infrastructure, Learning Aids, Teaching Staff, Non-Teaching Staff, Governing Bodies, Pedagogy, and Academic Subjects are examples of a few entities that surface. These entities do not stand in isolation and are usually supplemented with peripheral associations and terms. This discovery process is iterative and usually spawns a universal terminology for a domain.
Note: Entities in this context are terms and should not be confused with the more prevalent concept of entity relationships (ERs).
Properties: The entities associated with a domain or sub-domain are analyzed to identify the properties, processes, services and other aspects related to them. In the example above, for an entity like Teaching Staff, the associated properties could be Qualifications, Subjects, Number of Years of Experience, etc. This exercise brings clarity about the entities and helps in categorizing them and identifying parameters and specifications.
Categories: The entities and their properties form logical clusters which can be carved into categories and sub-categories. The higher or more generic category subsumes the sub-categories within it, leading to hierarchical relations. When organized together as a whole, they yield taxonomies. In the example above, there could be many categories around each of these entities.
Values: These are the most granular units within a taxonomy. They are usually the assignments in the last sub-category within a branch of a taxonomy. In the current example, consider analyzing the entity Learning Aids; as we build the taxonomies around it, the possible values that can evolve could be visual aids, recorded audio, textbooks, and so on.
Parameters and Specifications: These do not form part of the taxonomy but carry essential information. This information may be used to execute a process or provide a service. For example, student admissions can have specifications like age, address, etc.
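The hierarchy described by these components can be pictured as a small tree. The following Python sketch is illustrative only: the class names, the property names ("Format", "Language") and the category names ("Print", "Audio-Visual", "Visual", "Audio") are assumptions made for the example and are not part of the DIGIT specifications. Only the domain, sub-domain, entity and value names come from the college education example above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Category:
    """A category or sub-category; values are the most granular units."""
    name: str
    subcategories: list[Category] = field(default_factory=list)
    values: list[str] = field(default_factory=list)

    def paths(self, prefix: tuple[str, ...] = ()) -> list[tuple[str, ...]]:
        """Return every path from this category down to a value."""
        here = prefix + (self.name,)
        found = [here + (value,) for value in self.values]
        for sub in self.subcategories:
            found.extend(sub.paths(here))
        return found


@dataclass
class Entity:
    """A noun in the domain, with its properties and categories."""
    name: str
    properties: list[str]
    categories: list[Category]

    def paths(self) -> list[tuple[str, ...]]:
        found: list[tuple[str, ...]] = []
        for category in self.categories:
            found.extend(category.paths((self.name,)))
        return found


# Hypothetical categorization of the "Learning Aids" entity.
learning_aids = Entity(
    name="Learning Aids",
    properties=["Format", "Language"],  # assumed properties
    categories=[
        Category("Print", values=["textbooks"]),
        Category(
            "Audio-Visual",
            subcategories=[
                Category("Visual", values=["visual aids"]),
                Category("Audio", values=["recorded audio"]),
            ],
        ),
    ],
)

taxonomy = {
    "domain": "Indian Education System",
    "sub_domain": "Higher Education Systems",
    "entities": [learning_aids],
}

for path in learning_aids.paths():
    print(" > ".join(path))
```

Parameters and specifications are deliberately left out of the tree, since they do not form part of the taxonomy itself.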
Design
To ensure this taxonomy fits the needs of interested stakeholders, the following principles have been followed in designing it. The principles are explained in context, using inferences from the Property Tax module for illustration.
Usable: Property Tax is predominantly a local tax falling under the jurisdiction of State or ULBs and is governed by State Acts. The local options regarding the tax will be heterogeneous. To fit the multivariate property tax systems in India the taxonomy has been designed to be coarse-grained, light-weight and minimalistic.
Evolvable: The taxonomy is designed to evolve over a period of time, thereby adapting to changing needs and emerging technologies. For example, “Collection Channels” for Property Tax are defined to accommodate any future advancements in the digital payments space.
Modular: The categorizations in the taxonomy are designed modularly, yet they function together as a whole. They are independent and self-contained and may be combined and configured with similar units to achieve a different outcome. These categorizations can be unbundled into multiple simple categorizations. They can be further re-bundled in a modular manner to suit disparate contexts. For example, the Property “Usage” element and its sub-classifications can be easily reapplied in the context of any Building Plan Approval System.
Extensible: The taxonomy is designed to be exhaustive, and the elements of Property Tax are positioned in a hierarchy that can accommodate both horizontal and vertical additions. This leaves room for wider adoption and innovation to suit the contexts of any ecosystem. Also, as stated earlier in the document, the end goal is to build a knowledge practice that supports Open Standards, and the taxonomy is an entry point. As it evolves, it will be used as a basis to build ontologies around the Property Tax domain.
Open: The taxonomy has been designed to be ‘open’ to allow publishing under the most unrestricted licenses to enable wider ecosystem participation and foster innovations.
Data Modelling Specifications
The Data Model standards and specifications help in explaining the DIGIT module design specifications that further translate the conceptual rules into recommendations, requirements, and restrictions.
Approach
Data models offer a three-dimensional view of what data needs to be collected, how the database is structured, and how the data entities across segments map to one another. The principles and specifications in this section help in defining the key standards applicable across the DIGIT platform to ensure interoperability and consistency in database design.
Versioning: The design of the data models takes into consideration the fact that they will evolve with time and allow contributions from multiple stakeholders. Version upgrades will be published on this site for reference.
Scope
The data model design and specifications are based on the guidelines listed below.
Open: To maintain technology and vendor neutrality, DIGIT data models -
are published under the Creative Commons Attribution 4.0 International License
do not assume or require the choice of any proprietary technology
are designed and ratified through a multi-stakeholder process
are developed/reviewed/adopted by a group of experts through a consensus-based process
Evolvable: To adapt to changing needs over time, data models -
are versioned, with backward compatibility for at least one major version
are protocol-agnostic, to facilitate innovation in protocols and support solution use cases over multiple protocols (e.g. governance applications over chat interfaces). In other words, they follow the Rule of Least Power. To achieve this, data models carry all the necessary information in their own structures, rather than depending on protocol-specific fields/headers:
request metadata, e.g. authentication token, device information, signatures
response metadata, e.g. response signatures, processing status, correlation IDs
error data, e.g. error codes, messages
Extensible: Given the diversity of India’s urban systems, it is important to ensure that standards do not limit the ability of solution providers to develop solutions that meet local needs. Further, to enable innovation, standards should not be restrictive in their specification and application. Thus, standards need to be extensible to enable ecosystem actors to innovate and build locally relevant solutions. Therefore, data models -
extend from existing international/national standards like National Municipal Accounting Manual (NMAM) wherever possible
allow adding optional business extension elements
document such extension elements in the same manner as base models and APIs, and make them available in a public repository
Minimalistic: To enable ecosystem actors to easily adopt standards while empowering them to innovate, data models -
contain minimal mandatory fields
require only the most fundamental API operations to enable faster compliance while fulfilling needed functional requirements. The consistent pattern in API operations on various entities makes them simple to adopt.
avoid including attributes/APIs needed for specific solutions that are not yet known to be applicable to wider solutions.
Balance Data Privacy with Data Empowerment: To leverage the power of data while ensuring safe usage, data models -
require only minimal personally identifiable information (PII) to be collected mandatorily, thereby reducing the risk to PII
provide policy-based access control to enable the creation, modification and sharing of data as needed
include provision for data anonymization and proxy fields for PII and other sensitive data
Provide for non-repudiability: To ensure the right attribution for the data, data models -
declare access mechanisms for APIs
provide digital signatures
provide APIs for retrieving data access information
Unbundled: To provide the most fundamental building blocks while ensuring minimalism, extensibility and evolvability, data models -
limit the mandatory information in data models
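The protocol-agnostic guideline above (request metadata, response metadata and error data carried in the model's own structures) can be sketched as a message envelope. This is a minimal Python sketch under stated assumptions: the class names, field names, status strings and the "AUTH_MISSING" error code are hypothetical and are not taken from the DIGIT specifications. The point is only that nothing here relies on HTTP headers or any other transport-specific field, so the same message could travel over a chat interface or a queue.

```python
from __future__ import annotations

import json
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class RequestInfo:
    """Request metadata carried inside the message body (assumed fields)."""
    auth_token: str
    device_info: Optional[str] = None
    signature: Optional[str] = None
    correlation_id: str = field(default_factory=lambda: str(uuid.uuid4()))


@dataclass
class ErrorInfo:
    code: str
    message: str


@dataclass
class ResponseInfo:
    """Response metadata and error data, also carried in the body."""
    status: str
    correlation_id: str
    signature: Optional[str] = None
    errors: list[ErrorInfo] = field(default_factory=list)


def build_request(payload: dict[str, Any], info: RequestInfo) -> str:
    """Serialize metadata and payload together into one message."""
    return json.dumps({"requestInfo": asdict(info), "payload": payload})


def handle(message: str) -> str:
    """Process a message using only what the body itself contains."""
    body = json.loads(message)
    info = body["requestInfo"]
    errors = []
    if not info.get("auth_token"):
        errors.append(ErrorInfo("AUTH_MISSING", "No authentication token supplied"))
    response = ResponseInfo(
        status="FAILED" if errors else "SUCCESSFUL",
        correlation_id=info["correlation_id"],
        errors=errors,
    )
    payload = {} if errors else body["payload"]
    return json.dumps({"responseInfo": asdict(response), "payload": payload})


request = build_request(
    {"complaint": "streetlight not working"},
    RequestInfo(auth_token="example-token", correlation_id="c-1"),
)
print(handle(request))
```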
Design
Given the role and significance of the DIGIT platform in building citizen-centric applications and technology-based solutions for the community, it is crucial to define the data modelling standards. Standardized data aids in interoperability, and improves accuracy while ensuring consistency.
Data models are broken down into simpler fundamental units (models) as far as possible. For example, a property assessment model is composed of a property model and an assessment model, with the assessment model referring to the property model.
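A minimal Python sketch of this unbundling follows. The field names and the choice of which fields are mandatory are assumptions made for illustration; they are not the published DIGIT property or assessment models. The assessment holds only a reference to the property, so each model can evolve and be reused independently.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from typing import Optional


def new_id() -> str:
    """Generate a UUID string to identify an object."""
    return str(uuid.uuid4())


@dataclass
class Property:
    usage: str                       # assumed mandatory field
    id: str = field(default_factory=new_id)
    address: Optional[str] = None    # optional by default


@dataclass
class Assessment:
    property_id: str                 # reference to a Property, not a copy of it
    financial_year: str              # assumed mandatory field
    id: str = field(default_factory=new_id)
    amount_due: Optional[float] = None


home = Property(usage="RESIDENTIAL")
assessment = Assessment(property_id=home.id, financial_year="2023-24")
print(assessment.property_id == home.id)
```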
Data models include a Universally Unique Identifier (UUID) field which uniquely identifies the object of the respective entity type within the respective domain. Data models require minimal mandatory fields to enable maximal inclusion. As a rule of thumb, when it is unclear whether a field is absolutely required in all scenarios, it should be made optional.
Data models extend, reuse, or adopt international or national models wherever available and applicable, e.g. Open311 for citizen services like grievances and schema.org for general model definitions.
Data models are extensible, i.e. they allow a way to capture extra information that was not included during the initial model design. To achieve this, a data model may provide a simple map of key-value pairs. Future versions of the data model may choose to promote such keys to mandatory or optional named attributes after researching the wider applicability of the fields.
Data models allow for namespacing in field names to indicate the source/reason/category of the extended fields.
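One way to picture the extension map and namespaced field names is the Python sketch below. The attribute name additional_details, the dot separator, and the namespaces "stateA" and "cityB" are hypothetical choices for this example; the source does not prescribe a particular naming scheme.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Property:
    id: str
    usage: str
    # Open-ended key-value map for information not covered by the base model.
    additional_details: dict[str, Any] = field(default_factory=dict)

    def extend(self, namespace: str, key: str, value: Any) -> None:
        """Store an extension field under a namespaced key."""
        self.additional_details[f"{namespace}.{key}"] = value

    def extensions_for(self, namespace: str) -> dict[str, Any]:
        """Return the extension fields that belong to one namespace."""
        prefix = namespace + "."
        return {
            key[len(prefix):]: value
            for key, value in self.additional_details.items()
            if key.startswith(prefix)
        }


prop = Property(id="p-1", usage="RESIDENTIAL")
prop.extend("stateA", "heritageZone", True)
prop.extend("cityB", "wardCode", "W12")
print(prop.extensions_for("stateA"))
```

Keeping the namespace in the key makes the source of each extended field visible, which helps when deciding later whether a field is widely applicable enough to become a named attribute.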