Information environment based on the principles of Open Data

The proposed information environment is a kind of decentralized social network. Unlike many existing solutions, it offers a number of useful properties beyond decentralization and is built on fairly simple, standard technologies (email, JSON, text files and a bit of blockchain). This allows anyone with basic programming knowledge to create their own services for it.

Universal Identifier

In any online environment, user and object identifiers are among the key elements of the system.

In this case, the user identifier is an email address, which has in effect become the generally accepted identifier for authorization on websites and other services (Jabber, OpenID).

Strictly speaking, the user ID in this online environment is a login + domain pair, written for convenience in the familiar email form. For greater decentralization, it is desirable for each user to have their own domain. This is close to the principles of IndieWeb, where a domain is used as the user identifier. In our case, the user adds a nickname to their domain, which makes it possible to create multiple accounts on the same domain (for friends, for example) and makes the addressing system more flexible.

This user ID format is not tied to any particular network. If the user hosts their data on the Tor network, domains in the .onion zone can be used; on a network with a blockchain-based DNS, domains in the .bit zone. As a result, the addressing format for users and their data does not depend on the network over which the data is transmitted (the login + domain pair is used everywhere). For those who want to use a Bitcoin/Ethereum address as an identifier, the system can be modified to use pseudo email addresses of the form [email protected]
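As a rough illustration of the login + domain idea, the Python sketch below splits an identifier into its two parts. The UserId type, the parse_user_id function and the sample addresses are hypothetical names chosen for this example; they are not part of the described environment.

```python
from typing import NamedTuple


class UserId(NamedTuple):
    """Hypothetical container for the login + domain pair."""
    login: str
    domain: str


def parse_user_id(identifier: str) -> UserId:
    """Split an email-style identifier at the last '@' into login and domain."""
    login, sep, domain = identifier.strip().rpartition("@")
    if not sep or not login or not domain:
        raise ValueError(f"not a login@domain identifier: {identifier!r}")
    # Domain names are case-insensitive, so normalize them for comparison.
    return UserId(login, domain.lower())


# The same parsing applies whatever network the domain lives on.
# All three addresses are made-up placeholders.
for address in ("alice@example.com", "bob@exampleonionaddress.onion", "carol@example.bit"):
    user = parse_user_id(address)
    print(user.login, user.domain)
```

Because only the text after the last "@" is treated as the domain, the same function handles ordinary, .onion and .bit domains without special cases.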

Object addressing

This online environment is essentially a set of objects that are described in a structured, machine-readable form, refer to other objects and are tied to a specific user (email) or project/organization (domain).

URNs in the urn:opendata namespace are used as object identifiers. For example, a user profile has an address like:

urn:opendata:profile:[email protected]

The user's comment has an address of the form:

urn:opendata:comment:[email protected]:08adbed93413782682fd25da77bd93c99dfd0548

where 08adbed93413782682fd25da77bd93c99dfd0548 is a random SHA-1 hash that acts as the object ID, and [email protected] is the owner of this object.
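The URN layout described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the OpenDataUrn type and both function names are hypothetical, alice@example.com is a placeholder owner, and the way the random ID is produced (SHA-1 over random bytes) is a guess, since the text only says that the ID is a random SHA-1 hash.

```python
import hashlib
import secrets
from typing import NamedTuple, Optional


class OpenDataUrn(NamedTuple):
    object_type: str          # e.g. "profile" or "comment"
    owner: str                # login@domain, or a bare domain for a project
    object_id: Optional[str]  # 40-character hex ID; absent for profiles


def parse_urn(urn: str) -> OpenDataUrn:
    """Parse urn:opendata:<type>:<owner>[:<id>] into its components."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[:2] != ["urn", "opendata"] or not all(parts):
        raise ValueError(f"not an urn:opendata identifier: {urn!r}")
    object_id = parts[4] if len(parts) == 5 else None
    return OpenDataUrn(parts[2], parts[3], object_id)


def new_object_urn(object_type: str, owner: str) -> str:
    """Build a URN for a new object with a freshly generated random ID."""
    # Assumption: the ID is the SHA-1 digest of 32 random bytes.
    object_id = hashlib.sha1(secrets.token_bytes(32)).hexdigest()
    return f"urn:opendata:{object_type}:{owner}:{object_id}"


print(parse_urn("urn:opendata:profile:alice@example.com"))
print(parse_urn(new_object_urn("comment", "alice@example.com")))
```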

The principle of publishing user data

Having a domain under their control, a user can simply publish their data and content. Unlike IndieWeb, this does not require creating a website with HTML pages that embed semantic data.

For example, basic information about the user is placed in the datarobots.txt file, which is located at an address of the form

http://55334.ru/[email protected]/datarobots.txt

And it has the following format:

Object: user
Services-Enabled: 55334.ru,newethnos.ru
Ethnos: newethnos
Delegate-Tokens: http://55334.ru/[email protected]/delegete.txt

In essence, it is a set of lines with key: value data, which anyone with basic programming knowledge can parse easily. If desired, the data can be edited in an ordinary text editor such as Notepad.
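To show how little code such a file needs, here is a minimal Python sketch of a parser. The function names are hypothetical, the Delegate-Tokens URL is a placeholder on example.com, and the behavior for malformed or repeated lines is an assumption, since the article does not specify it.

```python
SAMPLE = """\
Object: user
Services-Enabled: 55334.ru,newethnos.ru
Ethnos: newethnos
Delegate-Tokens: http://example.com/delegate.txt
"""


def parse_datarobots(text: str) -> dict[str, str]:
    """Turn 'Key: value' lines into a dictionary."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        # Split at the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue  # assumption: blank or malformed lines are skipped
        fields[key.strip()] = value.strip()  # assumption: the last duplicate wins
    return fields


def enabled_services(fields: dict[str, str]) -> list[str]:
    """Split the comma-separated Services-Enabled value into a list of domains."""
    raw = fields.get("Services-Enabled", "")
    return [name.strip() for name in raw.split(",") if name.strip()]


fields = parse_datarobots(SAMPLE)
print(fields["Object"])
print(enabled_services(fields))
```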

More complex data (profile, comment, post, etc.) that has its own URN is returned as a JSON object through the standard API (http://opendatahub.org/api_1.0?lang=ru). The API can be hosted either on the user's own domain or on a third-party site to which the user has delegated the storage, publication and editing of their data (listed in the Services-Enabled line of the datarobots.txt file). Such third-party services are described below.

Simple ontology and JSON

The ontology of the communication environment is relatively simple compared to the ontologies of industry knowledge bases, since the communication environment has a relatively small set of standard objects (post, comment, like, profile, feedback), each with a relatively small set of properties.

Therefore, JSON is sufficient to describe objects in such an environment; XML is more complex in structure and harder to parse (it is important to keep the entry threshold low and the system scalable).

To get an object with a known URN, you need to contact the user's domain or one of the third-party services to which the user has delegated the management of their data.

In this online environment, each domain that hosts an online service also has its own datarobots.txt, located at an address like example.com/datarobots.txt, with similar content:
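One possible way to turn this lookup rule into code is sketched below in Python. The function name is hypothetical, and the order (the owner's own domain first, then the delegated services) is an assumption made for the example; the article does not prescribe an order.

```python
def lookup_hosts(urn: str, services_enabled: str) -> list[str]:
    """List the hosts that may serve an object, given its URN and the
    owner's comma-separated Services-Enabled value."""
    owner = urn.split(":")[3]              # login@domain, or a bare domain
    own_domain = owner.rpartition("@")[2]  # the text after the last '@'
    hosts = [own_domain]
    for service in services_enabled.split(","):
        service = service.strip()
        if service and service not in hosts:
            hosts.append(service)
    return hosts


# alice@example.com is a placeholder owner used only for this example.
print(lookup_hosts("urn:opendata:profile:alice@example.com", "55334.ru,newethnos.ru"))
```

A client could then try each host in turn until one of them returns the object.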

Object: service
Api: http://newethnos.ru/api
Api-Version: http://opendatahub.org/api_1.0

From this we can learn that data about an object can be retrieved at an address of the form:

http://newethnos.ru/api?urn=urn:opendata:profile:[email protected]
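A minimal Python sketch of this step reads the Api line from a service's datarobots.txt and builds the request address for a given URN. The function names and the alice@example.com owner are hypothetical; leaving ":" and "@" unescaped simply mirrors the address form shown above and is not a documented requirement of the API.

```python
from typing import Optional
from urllib.parse import urlencode

SERVICE_DATAROBOTS = """\
Object: service
Api: http://newethnos.ru/api
Api-Version: http://opendatahub.org/api_1.0
"""


def api_base(datarobots_text: str) -> Optional[str]:
    """Return the value of the 'Api' line, or None if it is missing."""
    for line in datarobots_text.splitlines():
        key, sep, value = line.partition(":")
        # Compare the whole key so that 'Api-Version' is not matched.
        if sep and key.strip() == "Api":
            return value.strip()
    return None


def object_url(base: str, urn: str) -> str:
    """Build the address for fetching one object by its URN."""
    # Keep ':' and '@' readable, matching the address form in the article.
    return f"{base}?{urlencode({'urn': urn}, safe=':@')}"


base = api_base(SERVICE_DATAROBOTS)
print(object_url(base, "urn:opendata:profile:alice@example.com"))
```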

The JSON object has the following structure:

{
    "urn": "urn:opendata:profile:[email protected]",
    "status": 1,
    "message": "Ok",
    "timestamp": 1596429631,
    "service": "example.com",
    "data": {
        "name": "John",
        "surname": "Gald",
        "gender": "male",
        "city": "Moscow",
        "img": "http://domain.com/image.jpg",
        "birthtime": 332467200,
        "community_friends": {
            "[email protected]": "1",
            "[email protected]": "0.5",
            "[email protected]": "0.7"
        },
        "interests_tags": "cars,cats,cinema",
        "mental_cards": {
            "no_alcohol@main": 8,
            "data_accumulation@main": 8,
            "open_data@main": 8
        }
    }
}
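A client might consume such a response roughly as in the Python sketch below. It assumes that status 1 means success (suggested by the "Ok" message but not stated), that the friend weights arrive as strings and should be read as numbers, and that timestamps are Unix seconds. The read_profile function and the example.com addresses are hypothetical.

```python
import json
from datetime import datetime, timezone

RAW = """{
    "urn": "urn:opendata:profile:alice@example.com",
    "status": 1,
    "message": "Ok",
    "timestamp": 1596429631,
    "service": "example.com",
    "data": {
        "name": "John",
        "surname": "Gald",
        "community_friends": {"bob@example.com": "1", "carol@example.com": "0.5"},
        "interests_tags": "cars,cats,cinema"
    }
}"""


def read_profile(raw: str) -> dict:
    """Decode a profile response and normalize a few of its fields."""
    obj = json.loads(raw)
    if obj.get("status") != 1:  # assumption: 1 means success
        raise ValueError(f"service reported an error: {obj.get('message')!r}")
    data = obj["data"]
    return {
        "urn": obj["urn"],
        # Assumption: the timestamp is in Unix seconds (UTC).
        "fetched_at": datetime.fromtimestamp(obj["timestamp"], tz=timezone.utc),
        "full_name": f"{data['name']} {data['surname']}",
        # Weights are strings in the response; convert them to floats.
        "friends": {who: float(w) for who, w in data.get("community_friends", {}).items()},
        "interests": [t for t in data.get("interests_tags", "").split(",") if t],
    }


profile = read_profile(RAW)
print(profile["full_name"], profile["friends"], profile["interests"])
```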

Service architecture

Third-party services are needed to simplify the process of publishing and searching for data by end users.

One type of service was mentioned above: services that help users publish their data on the network. There may be many such services, each providing a convenient interface for editing one type of data (forum, blog, Q&A, etc.). Users who do not trust third-party services can install a data service script on their own domain or develop one themselves.

In addition to services that allow users to publish / edit data, the online environment provides a number of other services that perform relatively complex tasks that are quite problematic to implement on end user nodes.

One type of such service is the data hub (for example, opendatahub.org/en), which acts as a kind of web archive that collects all public machine-readable user data and provides access to it via an API.

The presence of services in such an open, decentralized online environment significantly lowers the entry threshold for users, since there is no need to install and configure a node of their own. At the same time, users retain control over their data: at any time they can change the service to which publication is delegated, or set up their own node.

If a user is not interested in owning their data at all and has neither their own domain nor access to the domain of someone they know, then by default opendatahub.org manages their data.

At whose expense is all this?

Perhaps the main problem of almost all such decentralized projects is the impossibility of monetizing them at a level sufficient for stable development and support.

Donations and tokens are used to cover development and marketing costs in this online environment.

All donations that users make to internal projects/services are public, machine-readable, and bound to an email address. This allows them to be taken into account, for example, when calculating an internal social rating, and to be published on user pages. When donations cease to be anonymous, users are in effect no longer donating but “chipping in” to support a common information environment, much as people chip in to repair common areas, with a corresponding attitude towards those who refuse to contribute.

In addition to donations, a limited supply of tokens (400,000) is used to raise funds; the tokens are credited to everyone who donates to the main fund (ethnogenesis).

Additional features of tokens

Each token is a “key” for access to this online environment. That is, you can use the services and be part of the online environment only if you have at least 1 token that is linked to your email.

Tokens are a good spam filter due to their limited nature. The more users in the system, the more difficult it is to get a token and the more expensive it is to create bots.

People, their data and social connections are more important than technology

The described online environment is technically a relatively primitive solution. What matters more in it is not so much the technology as the people, their social connections and the data (content) created within the environment.

When a better technical solution appears, the resulting community, whose members have their own universal identifiers (email and their own domain) and structured data (with URN addresses, an ontology and JSON objects), can transfer all of this data to another online environment while preserving the established connections (ratings) and content.

This post describes one element of a self-organized network community which, in addition to a decentralized online environment, includes a number of offline areas that increase the benefits of the online environment and act as “customers” that largely determine its functionality. But these are topics for other articles that are not directly related to IT and technology.

Source: habr.com
