Introduction:
AntDB, the distributed database from AsiaInfo, recently released version 7.0. The release aims to better meet the data management needs of AsiaInfo customers and to improve the service and business expansion capabilities of a general-purpose database, helping operators achieve full independent control of their core systems and launch their business systems smoothly. Domestic databases still have a long way to go, and distributed databases will play an important role in that development. Analyzing the development trends and difficulties of domestic distributed databases is therefore a useful reference not only for AntDB but for domestic databases as a whole.
Ⅰ. Domestic databases are embracing opportunities
The information innovation industry, that is, the information technology application innovation industry, has become a national strategy in China in recent years and a new driving force for economic development. With the successive introduction of supporting policies and the resulting wave of reform, exploring secure and controllable core technologies and products has gradually become a trend. In the database field, party and government organs have declared that they will adopt domestic databases across the board, and this has been piloted in the financial industry. Domestic database vendors are following the trend and accelerating the creation of their own flagship products. Since 2020, domestic databases have been spreading from party and government and finance into communications, energy, transportation, the industrial Internet of Things and other fields. The market outlook is very promising.
In June 2022, Dameng Data, Transwarp Technology and other database vendors submitted prospectuses one after another, seeking market valuations in the tens of billions on revenue of 700 million in the domestic science and technology sector. This "signal" indicated that the development of domestic databases has entered the fast lane. On the one hand, domestic databases are benefiting from a policy opportunity. The domestic database market used to be monopolized by Oracle, IBM and other enterprises, but in recent years, thanks to accelerated policy support, the market share of domestic database companies has increased significantly. On the other hand, domestic databases are also in a period of market opportunity. The service model of databases and other basic software is gradually moving to the cloud, and the large-scale adoption of cloud computing has opened a new development opportunity for databases.
Ⅱ. Distributed databases face opportunities and challenges
Databases first appeared in the 1960s, and the relational model proposed by IBM Labs laid the foundation for relational database technology, which has been popular worldwide for nearly 50 years. With the rapid development of information and communication technology and the mobile Internet, business workloads came to feature highly concurrent reads and writes, massive data volumes and heterogeneous data structures. Later, post-relational databases began to emerge, supplementing and improving on the traditional relational database. That is when distributed databases flourished and became well known.
Compared with traditional relational databases, distributed databases have significant advantages such as smooth scaling, high availability, low cost, etc. If we summarize the comparison of traditional relational databases, non-relational databases and distributed databases, we can get Table 1:
Table 1: Database Comparison
Feature supported | Traditional relational database | Non-relational database | Distributed database |
Relational Model | Yes | No | Yes |
SQL statement | Yes | No | Yes |
ACID | Yes | No | Yes |
Horizontal scalability | No | Yes | Yes |
Big data | No | Yes | Yes |
Unstructured data | No | Yes | No |
As the table shows, the distributed database is a product of this evolution, and it combines most of the advantages of traditional relational and non-relational databases. However, because it has had little time to develop, its standards and evaluation systems are not yet sound and its ecosystem is still incomplete. The future development of distributed databases in China will certainly bring both opportunities and challenges.
2.1 Advantages of distributed database
The three main advantages of distributed databases, namely smooth scaling, high availability and low cost, give them great potential for development. The following discusses these three advantages, using AsiaInfo Technologies' AntDB database as an example.
First, smooth scaling and high performance. Under a distributed execution plan, table data is spread across multiple nodes, which greatly reduces the data volume on any single node. The plan can also separate reads from writes, which helps make full use of the storage and computing resources of multiple nodes and effectively improves database throughput. Currently, AntDB can scale out nodes smoothly on demand and supports core workloads that require hundreds of thousands or even millions of TPS/QPS.
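The idea of spreading table data across nodes and separating reads from writes can be illustrated with a minimal Python sketch. It is not AntDB's implementation: the hash-based placement, the `Router` class and the node names such as `shard0-primary` are all assumptions made for illustration only.

```python
import hashlib
import itertools


def shard_for(key, num_shards):
    """Map a row key to a shard number with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class Router:
    """Hypothetical router: writes go to a shard's primary, reads to its replicas."""

    def __init__(self, num_shards, replicas_per_shard):
        self.num_shards = num_shards
        # One round-robin iterator per shard, used to spread reads over replicas.
        self._next_replica = [
            itertools.cycle(range(replicas_per_shard)) for _ in range(num_shards)
        ]

    def route(self, key, is_write):
        shard = shard_for(key, self.num_shards)
        if is_write:
            return f"shard{shard}-primary"
        replica = next(self._next_replica[shard])
        return f"shard{shard}-replica{replica}"


# Count how 10,000 hypothetical row keys are spread over 4 shards.
counts = [0] * 4
for i in range(10_000):
    counts[shard_for(f"user-{i}", 4)] += 1
print(counts)

router = Router(num_shards=4, replicas_per_shard=2)
print(router.route("user-1", is_write=True))
```

Each shard ends up holding only a fraction of the rows, and read traffic for a shard is spread over its replicas instead of loading the node that accepts writes.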
Second, high availability, which is crucial to enterprise data security and business continuity. The distributed database is designed with a primary/backup architecture: when the primary node fails, service automatically switches to the backup center (failover) to keep the core system continuously available. In addition, the data centers maintain data consistency and transaction integrity through synchronous or asynchronous replication, so a failover does not affect normal business operation. AntDB guarantees high availability through multiple replicas, distributed transaction processing and other mechanisms, together with multi-location, multi-center deployment.
Third, low cost. A distributed database runs on commodity PC servers and operating systems, which gives it an obvious advantage in hardware cost. In addition, AntDB's compatibility rate with Oracle Database is as high as 96%, which effectively reduces the risk of program migration and the cost of rewriting applications.
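The primary/backup failover described above can be sketched roughly as follows. This is a simplified illustration rather than AntDB's actual mechanism: the heartbeat timeout, the `Node` fields and the rule of promoting the most caught-up standby are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float  # time of the last heartbeat received, in seconds
    replicated_lsn: int    # how far this node has applied the replication log


def choose_primary(primary, standbys, now, timeout=3.0):
    """Keep the current primary while it is alive, otherwise promote a standby.

    Assumed policy: a node is alive if its heartbeat is at most `timeout`
    seconds old, and the live standby with the highest replicated_lsn wins.
    """
    if now - primary.last_heartbeat <= timeout:
        return primary
    live = [s for s in standbys if now - s.last_heartbeat <= timeout]
    if not live:
        raise RuntimeError("no live standby available to promote")
    return max(live, key=lambda s: s.replicated_lsn)


primary = Node("primary", last_heartbeat=100.0, replicated_lsn=500)
standbys = [
    Node("standby-a", last_heartbeat=109.5, replicated_lsn=480),
    Node("standby-b", last_heartbeat=109.8, replicated_lsn=495),
]
print(choose_primary(primary, standbys, now=110.0).name)
```

With synchronous replication the promoted standby would already hold every committed transaction. With asynchronous replication it may lag slightly, which is why this sketch prefers the standby that has applied the most log.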
2.2 Problems faced by distributed database
Because a distributed database has many nodes and a complex cluster structure, it also has shortcomings of its own, and because the technology is still young, many problems remain to be solved. First, according to the CAP theorem, a distributed database cannot fully satisfy every requirement at once. Some financial core applications, for instance, require both strong consistency and high availability, so customers may have to give up or relax some requirements. Second, operation and maintenance are complex. Depending on business needs, a distributed database usually consists of multiple servers, so managing its hardware and software is often very complicated. Third, product maturity needs to improve. For example, advanced features for distributed databases, such as the optimizer, data types, complex queries, custom functions and stored procedures, are at uneven levels of maturity.
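One common way to make the consistency versus availability trade-off concrete is quorum replication, shown in the sketch below. It is a general illustration, not a description of how AntDB or any product named here works, and the function name and result keys are hypothetical. With N replicas, a write acknowledged by W replicas and a read that consults R replicas are guaranteed to overlap only when W + R > N, while larger W and R leave fewer replica failures that can be tolerated.

```python
def quorum_properties(n, w, r):
    """Summarize the trade-off for n replicas, write quorum w and read quorum r."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("quorum sizes must be between 1 and n")
    return {
        # Every read set intersects every write set only when w + r > n.
        "reads_see_latest_write": w + r > n,
        # Number of replicas that can be down while writes still succeed.
        "write_tolerates_failures": n - w,
        # Number of replicas that can be down while reads still succeed.
        "read_tolerates_failures": n - r,
    }


for w, r in [(2, 2), (3, 1), (1, 1)]:
    print(w, r, quorum_properties(3, w, r))
```

Requiring every replica for writes (W = 3) keeps reads cheap and consistent but blocks writes as soon as one replica is unavailable, whereas W = R = 1 stays available under two failures but no longer guarantees that a read sees the latest write.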
Ⅲ. Practice sharing of AntDB
Compared with mature and stable foreign commercial databases, domestic databases still lag in performance, stability, ecosystem and other respects. Replacing databases in China with independent and controllable ones is not simply a matter of swapping one database for another. It means replacing the old system with a new one across architecture, research and development, launch, and operation and maintenance, so as to comprehensively reduce dependence on specific databases, which will be a long and difficult process. As part of this process, AsiaInfo launched AntDB, a domestic distributed database. The core team added Oracle-compatible features in 2015, achieved second-level online expansion in 2017, added kernel-level read/write separation and other functions in 2019, and added support for dual memory and disk engines in 2022. The application fields of AntDB have also expanded from communications to finance, transportation, energy and other industries.
The use of AntDB in the operator's independent and controllable replacement project is a milestone of great significance. First, it explored an independent and controllable database architecture and, by developing a micro-base architecture, eliminated the dependence of applications on specific databases at the architectural level. Second, it verified that full independence and controllability of database hardware and software is feasible: at present, the combined solution of AntDB and Huawei Kunpeng servers can replace foreign commercial solutions in operators' core transaction scenarios. Third, it explored a database cutover solution based on grayscale release that achieves non-stop, zero-failure cutover and minimizes the business impact of replacing databases with domestic ones.
The AntDB distributed database solution has been commercially deployed in the communications industry and has been widely praised by customers. We also believe it can be promoted beyond communications to finance, party and government, energy, postal services and other important infrastructure industries, accelerating the digital transformation and upgrading of these industries.
In the financial industry, AntDB has been successfully put into commercial use in the big data system of an insurance company in northern China, and this experience can be extended to other financial or securities business systems whose workloads are mainly analytical.
In the government and enterprise sector, AntDB is used in the highway ETC billing and big data platform of a southern province, and this experience can be extended to other scenarios with similarly high data concurrency, such as IoT.
Looking ahead, AntDB will keep enhancing its versatility, standardization and security. To better serve AsiaInfo's customers, it will continue to invest in product development, aiming to support multiple data types and multiple business scenarios in a single database and to keep data reliable, with no loss, errors or redundancy, so as to provide customers with high-quality database products.
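A grayscale-release cutover of the kind mentioned above typically moves traffic to the new database in small, reversible steps. The sketch below shows one possible routing rule under stated assumptions: the bucket scheme, the function names and the labels `legacy_db` and `new_db` are hypothetical and do not describe the actual cutover tooling used in the project.

```python
import hashlib


def bucket(user_id):
    """Assign each user a stable bucket from 0 to 99 (assumed scheme)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def target_database(user_id, rollout_percent):
    """Send users whose bucket falls below the rollout percentage to the new database."""
    return "new_db" if bucket(user_id) < rollout_percent else "legacy_db"


# Ramp the rollout up in stages and count how many sample users have moved.
users = [f"user-{i}" for i in range(1000)]
for percent in (0, 10, 50, 100):
    moved = sum(target_database(u, percent) == "new_db" for u in users)
    print(percent, moved)
```

Because the bucket is derived from a stable hash, a user who has been moved at 10% stays on the new database at 50%, and lowering the percentage rolls the same users back without stopping the service.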
About AntDB
AntDB was established in 2008. Running on the core systems of operators, AntDB provides online services for more than 1 billion users in 24 provinces across the country. With product features such as high performance, elastic expansion and high reliability, AntDB can process one million core communications transactions per second at peak, has kept systems running continuously and stably for nearly ten years, and has been commercially deployed in communications, finance, transportation, energy, the Internet of Things and other industries.