24 Shared Registration System (SRS) Performance
TABLE OF CONTENTS
24.1 SHARED REGISTRY SYSTEM
24.1.1 EPP Software
24.1.3 EPP System Performance
24.1.4 Registrar Web Interface
24.2 RESOURCING PLANS
24.2.1 Human Resources
24.3 ABOUT THIS RESPONSE
〈〉 - - - - - 〈〉
24.1. SHARED REGISTRY SYSTEM
Our Shared Registry System (SRS) is a set of registrar-facing servers, each offering an Extensible Provisioning Protocol (EPP) interface and a Registrar Web Interface. These servers interact with a high-performance redundant back-end system that applies registry policy, and that in turn uses a distributed database master/slave cluster.
Our high-performance database system, built on PostgreSQL, uses live streaming replication to update multiple servers simultaneously. Any of these replicas can be promoted to full read/write operation if necessary, replacing the master server in case of a failure affecting that master.
Non-master servers share the load of database read-only operations, giving the overall database cluster a significant speed advantage over traditional single-server systems; however, write operations are exclusively and carefully routed to the master server to ensure consistency. Changes to the master are replicated to the other servers after being committed at the master.
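The routing rule described above can be illustrated with a minimal sketch. It is not the ISC-SRS implementation: the class name, the server names, and the round-robin choice of replica are all assumptions made for illustration.

```python
import itertools


class ClusterRouter:
    """Hypothetical router: writes go to the master, reads rotate over replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)
        self._reset_cycle()

    def _reset_cycle(self):
        # With no replica left, reads fall back to the master.
        self._read_cycle = itertools.cycle(self.replicas or [self.master])

    def route(self, is_write):
        """Return the server that should handle one database operation."""
        return self.master if is_write else next(self._read_cycle)

    def promote(self, replica):
        """Replace a failed master with one of the replicas."""
        self.replicas.remove(replica)
        self.master = replica
        self._reset_cycle()


router = ClusterRouter("db-master", ["db-replica-1", "db-replica-2"])
print(router.route(is_write=True))
print(router.route(is_write=False))
print(router.route(is_write=False))
```

Round-robin is only one possible read-balancing policy; a production router would also need to account for replication lag and server health.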
Each system component has been engineered to be fully redundant. The hardware used to host them has been designed to allow for single component failures, including power supplies, hard drives, and network interface cards.
In addition to that fully redundant hardware, the software that implements each service has been engineered so that it can operate as multiple instances of that service running on multiple servers. This ability to replicate components (both hardware and software) is critical to the scalability of any high volume, critical service. By having replicated services, the overall system may be geographically distributed among several data centers, which gives multiple-site redundancy to the registry and its services.
The servers in our system all use the Network Time Protocol (NTP) to synchronize their internal clocks, enabling the proper recording of audit traces so that event correlation can be performed at any instant. We operate our own Stratum 1 clock which (along with access to several other Stratum 1 clocks operated by other organizations) gives all of our servers a highly consistent, reliable, and accurate time baseline.
The internal communication between a server and the back-end system is performed with authenticated Representational State Transfer (REST) operations layered on HTTPS transport.
Our systems support both IPv4 and IPv6. Our registration systems and name server systems are fully DNSSEC compliant.
For information on Whois and DNS service implementation and performance, please refer to our answers to Questions 26 and 35.
24.1.1. EPP Software
Internet Systems Consortium's Shared Registry System (ISC-SRS) is a software package created from the ground up. Its design is based on the set of requirements specified in the ICANN gTLD Applicant's Guidebook and on our previous experience with registry software. This is a second generation implementation; its predecessor (OpenReg) had been an earlier initiative within ISC to provide an Open Source registry solution.
OpenReg was used by a few organizations, but mostly became a reference implementation for various software implementers who used it as the platform on which to build their solutions.
The experience gained from OpenReg, and the incorporation into ISC's staff of engineers and operations staff with background in registry operations, has led to the creation of a new and innovative system that makes use of modern software technologies to achieve simplicity, scalability, and resiliency with high performance and high integrity.
24.1.1.1. Software Development Process
ISC makes use of the Agile Software Development methodology, advocating frequent releases in short development cycles, with the intention of increasing productivity and introducing checkpoints where new requirements can be adopted.
Our methodology focuses on having extensive code reviews (by at least one other person), unit testing of all code, measurability by code-coverage analysis, and prioritizing based on functionality.
24.1.1.1.1. Coding Guidelines
For ISC-SRS, we follow the guidelines and recommendations of the BIND 9 Coding Guidelines as published at http://bind10.isc.org/wiki/BIND9CodingGuidelines
24.1.1.1.1.1. Code Review
As is standard for all ISC software development, code must be reviewed by peers before it can be considered ready for production testing. This ensures that coding guidelines have been followed and that the code is consistent and appropriately factored.
24.1.1.1.1.2. Code Versioning
Source code is stored using the Git version control system, which is widely used by organizations that do software development, including the Linux Kernel, Perl, Eclipse, Gnome, KDE, and PostgreSQL projects.
The main features of this system are:
* Distributed development: provides a full copy of the entire repository for developers;
* Strong support for non-linear development: rapid support for branching and merging;
* Efficient handling of large projects;
* Cryptographic authentication of history; and
* Easy integration with existing Unix command line tools.
24.1.1.1.2. Software Testing
24.1.1.1.2.1. Unit Tests
Unit testing is where software development starts. Once there is a clear definition of the functionality that we want to achieve, we proceed to write tests that exercise that functionality in small, factored pieces.
When the tests have been written, we run them, and verify that all of them fail.
Once the testing framework for a specific project has been put into place, we write the actual implementation, making sure that it completely covers what the tests look for.
When the code has been written, the tests verify each specific code function, always aiming at 100% source code coverage.
Having a full set of unit tests written from the beginning enables us to do regression testing, a vital step to make sure that new code additions do not break existing functionality.
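The test-first cycle described above can be sketched with Python's standard unittest module. The helper normalize_domain_name and its behaviour are hypothetical, chosen only to give the tests something to verify; they are not part of ISC-SRS.

```python
import unittest


def normalize_domain_name(name):
    """Hypothetical helper: lower-case a name and drop one trailing dot."""
    cleaned = name.strip().lower()
    if cleaned.endswith("."):
        cleaned = cleaned[:-1]
    if not cleaned:
        raise ValueError("empty domain name")
    return cleaned


class NormalizeDomainNameTest(unittest.TestCase):
    # In a test-first workflow these tests are written before the helper
    # exists, are seen to fail, and then drive the implementation above.
    def test_lowercases(self):
        self.assertEqual(normalize_domain_name("Example.COM"), "example.com")

    def test_strips_trailing_dot(self):
        self.assertEqual(normalize_domain_name("example.com."), "example.com")

    def test_rejects_empty(self):
        with self.assertRaises(ValueError):
            normalize_domain_name(" . ")


def run_tests():
    """Run the test case and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(NormalizeDomainNameTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Re-running the same suite after every change is what turns these unit tests into a regression check.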
24.1.1.1.2.2. Protocol Conformance Tests
In addition to unit testing, we run a different set of tests, which allow us to verify that the code conforms to protocol specifications.
To achieve this, we have developed a set of testing tools to check for specific use cases, as defined in Request For Comments (RFC) documents.
As with unit tests, these tools enable regression testing and help preserve backward compatibility as the code changes.
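As an illustration of what one such conformance check might look like, the sketch below inspects an EPP response for the namespace, result code, and server transaction identifier that RFC 5730 defines. The function name and the sample response are assumptions for illustration; this is not ISC's test tooling.

```python
import xml.etree.ElementTree as ET

# EPP namespace as defined in RFC 5730.
EPP_NS = {"epp": "urn:ietf:params:xml:ns:epp-1.0"}

SAMPLE_RESPONSE = """
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0">
  <response>
    <result code="1000">
      <msg>Command completed successfully</msg>
    </result>
    <trID>
      <clTRID>ABC-12345</clTRID>
      <svTRID>54321-XYZ</svTRID>
    </trID>
  </response>
</epp>
"""


def check_response(response_xml, expected_code):
    """Return a list of conformance problems (empty when none are found)."""
    problems = []
    root = ET.fromstring(response_xml.strip())
    if root.tag != "{urn:ietf:params:xml:ns:epp-1.0}epp":
        problems.append("root element is not <epp> in the EPP namespace")
    results = root.findall("epp:response/epp:result", EPP_NS)
    codes = [int(r.get("code")) for r in results]
    if not codes:
        problems.append("no <result> element in <response>")
    elif expected_code not in codes:
        problems.append(f"expected result code {expected_code}, got {codes}")
    if root.find("epp:response/epp:trID/epp:svTRID", EPP_NS) is None:
        problems.append("missing server transaction identifier <svTRID>")
    return problems


print(check_response(SAMPLE_RESPONSE, expected_code=1000))
```

A fuller suite would validate responses against the EPP XML schemas and cover each command's specific use cases.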
The ISC-SRS system is a high performance, scalable platform built for simultaneous support of multiple TLDs. ISC-SRS has the ability to scale with its customer base by adding resources (processors, storage, memory, network capacity) incrementally as needed.
The initial platform for Uniregistry consists of two live sites located at:
* Palo Alto, CA, US (Master); and
* Chicago, IL, US (warm standby).
A third site (another warm standby) is planned for deployment as soon as the risk assessment, to be performed in mid-2012, is complete.
Each site is fully capable of handling the entire largest-case expected load of SRS operations, and it is designed to be fully redundant from its component base to its network connectivity.
For Internet connectivity the ISPs that provide connectivity to our sites meet the following set of requirements:
* IPv4 and IPv6 native connectivity
* Ability to deliver multiple links through different network paths
* Ability to use an IGP (Interior Gateway Protocol) to automatically re-route traffic
* Presence at chosen data centers
* Good peering diversity
* Does not provide Internet transit services to another of our registry sites
Complementing the ISP-provided Internet access, two private links are provisioned between registry sites, providing a redundant path with consistent low latency for the purpose of guaranteeing database replication and front-end to back-end communication.
See EXHIBIT: "24-Diagram-Network-Sites.png" for a visual description of the connectivity for the sites.
On each site, connections are terminated in two routers having:
* One Internet Transit Link for Management
* One Internet Transit Link for Services
* One Inter-Site link
* One connection towards the other router
* One connection towards each firewall.
The routers provide the ability to adapt the network to link failures, providing the best available path for inter-site traffic and the rest of the world (Internet).
Site Configuration
The configuration for each registry site is as follows:
* Two (2) Routers
* Two (2) Network Firewalls
* Two (2) Network Switches
* Two (2) Local Load Balancers
* One (1) Global Load Balancer
* Two (2) High Performance Database Servers
* Two (2) EPP Servers
* Two (2) Whois Servers
* Two (2) Back-end Servers
* Two (2) Management Servers
* One (1) DNS Server
* Two (2) Hardware Security Modules (HSMs)
DNS services are provided using the ISC-SNS platform, which is an Anycast cloud-based service that disseminates and serves DNS information using a globally distributed system.
There is also an additional DNS server per site, in Unicast mode, for additional redundancy and diversity.
24.1.3. EPP System Performance
Our platform is not yet in use for this TLD, so we cannot measure performance with live traffic. We have done performance testing using a laboratory system to determine the load limits of our system and platform.
The laboratory testing platform on which these measurements were made consisted of only one server with low I/O performance (only one hard drive) that was executing three competing system components: EPP front-end, SRS back-end, and the database system.
Our laboratory performance measurements understate the performance that will be delivered in full production. As compared to the lab systems, the production EPP servers have higher capacity (more CPU cores, more RAM, faster memory bus). Further, the production servers are dedicated systems that do not perform any other functions that compete for resources.
The hardware profile of the servers is expected to change over the lifetime of the project as industry trends change. At the time of this writing, production servers are based on x86 architecture and fitted with enough RAM and storage capacity to run the expected workloads with a considerable excess capacity.
To measure performance, the platform was provisioned in a lab under the following conditions:
* Both the client and the server were located in the same room
* A 1-GigE switch was used to provide network connectivity
* No firewall or packet filtering was used for the test
* Each server had two network cables attached to the switch:
** (1) for Management
** (1) for Client-Server communication
24.1.3.1. Server Specifications
The hardware chosen for the laboratory testing platform differs from the production architecture. For actual deployment we use a multi-level system to provide load balancing in conjunction with high performance dedicated database systems to minimize the impact of Input/Output bottlenecks. Our test system did not have any of that extra capacity or high-performance backend.
Our test platform is thus less powerful than that which we will actually use. We anticipate that in full production, our platforms will exhibit much higher performance capacity.
Yet even on this scaled-down test platform we easily meet the requirements of Specification 10. The test configuration can be seen at EXHIBIT: "24-Table-Server-Specifications.pdf"
To test the platform, the performance software (perf-epp) simulated a registrar performing a high volume of operations on the server. Performance metrics were collected to measure the number of transactions per second (TPS) as well as the round trip time (RTT) of each individual operation.
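The two metrics named above can be derived from per-operation timestamps. The sketch below shows one way to compute them; the function name and the input layout (a list of start and end times in seconds) are assumptions, not the actual format used by perf-epp.

```python
import statistics


def summarize(samples):
    """Summarize a run from (start, end) timestamps in seconds.

    Assumes `samples` is non-empty and the run has a non-zero duration.
    """
    rtts = [end - start for start, end in samples]
    first_start = min(start for start, _ in samples)
    last_end = max(end for _, end in samples)
    elapsed = last_end - first_start
    return {
        "operations": len(samples),
        # Transactions per second over the whole run.
        "tps": len(samples) / elapsed,
        # Round trip time statistics for individual operations.
        "rtt_mean": statistics.mean(rtts),
        "rtt_max": max(rtts),
    }


print(summarize([(0.0, 0.5), (0.5, 1.0), (1.0, 2.0)]))
```

With parallel connections the operations overlap in time, which is why throughput is computed from the overall elapsed time rather than from the sum of the round trip times.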
The 'perf-epp' software simulated:
* 1 registrar sending a stream of operations;
* 2, 4, 8, 16, 32, 64 and 128 parallel connections performing operations.
Once the client has been initialized with the command line options, it proceeds to:
* Start Phase 1, which:
** Opens 1 client connection to the EPP server
** Generates a random list of 500 unique domain names
** Generates 500 EPP "create", "info" and "delete" commands
** Sends all the "create", "info" and "delete" commands and stores performance metrics in a file
** Continues to Phase 2
* Phases 2-8 perform the same steps as above, but using 2, 4, 8, 16, 32, 64 and 128 parallel connections to the EPP server.
* Once all the processes have completed, a post-processing routine analyzes the files to produce the statistical results of the operation.
* For the purpose of measuring the EPP session commands, we decided to time how long it would take to perform an EPP "login" command on the server.
The perf-epp tool collected the "login" RTT as it progressed through the test.
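The phase structure described above can be sketched as a plan generator. This is not the perf-epp source. The sketch assumes one "create", "info" and "delete" command per generated name, and assumes each name is handled by a single connection so that its commands stay in order; the text does not state how work was divided among parallel connections. The "example" TLD and the 12-letter labels are also placeholders.

```python
import random
import string

CONNECTION_COUNTS = [1, 2, 4, 8, 16, 32, 64, 128]  # Phases 1 through 8
NAMES_PER_PHASE = 500
VERBS = ("create", "info", "delete")


def random_domain_names(count, rng, tld="example"):
    """Generate `count` unique random names under a placeholder TLD."""
    names = set()
    while len(names) < count:
        letters = (rng.choice(string.ascii_lowercase) for _ in range(12))
        names.add("".join(letters) + "." + tld)
    return sorted(names)


def build_plan(rng=None):
    """Return one entry per phase, with the commands for each connection."""
    if rng is None:
        rng = random.Random()
    plan = []
    for connections in CONNECTION_COUNTS:
        names = random_domain_names(NAMES_PER_PHASE, rng)
        work = []
        for i in range(connections):
            own_names = names[i::connections]  # names owned by connection i
            cmds = [(verb, n) for verb in VERBS for n in own_names]
            work.append(cmds)
        plan.append({"connections": connections, "names": names, "work": work})
    return plan


for phase in build_plan(random.Random(0)):
    total = sum(len(commands) for commands in phase["work"])
    print(phase["connections"], "connections,", total, "commands")
```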
Session Command
All the session initiation commands were performed within the SLA specified in Specification 10 of the Applicant's Guidebook.
See EXHIBIT: "24-Chart-EPP-Graph-login.png" for a visual interpretation of the data generated by the test.
The Y-axis represents the round-trip time (RTT) in seconds, while the X-axis represents the time in seconds elapsed during the test. The blue plotted line shows the round-trip times as observed by the tool. The red line shows the Service Level Agreement (SLA) limit given in Specification 10.
Query Commands
To test the EPP query operations, the perf-epp system uses the "info" command.
See EXHIBIT: "24-Chart-EPP-rtt-graph-query.png" for the round trip times observed during the test. The test shows consistent response times within the SLA specification in the Applicant's Guidebook.
See EXHIBIT: "24-Chart-EPP-qps-graph-query.png" for the query throughput that the test system achieved with 1 to 128 simultaneous clients.
Object Transformations
Object creation and deletion are part of the EPP transform commands. EXHIBIT: "24-Chart-EPP-rtt-graph-create.png" and EXHIBIT: "24-Chart-EPP-rtt-graph-delete.png" graphically compare the results of the performance test against the SLA in Specification 10 of the Applicant's Guidebook.
EXHIBIT: "24-Chart-EPP-qps-graph-create.png" and EXHIBIT: "24-Chart-EPP-qps-graph-delete.png" show the number of transactions per second (TPS) that the system was able to achieve during simultaneous client connections. To stress the system, 1, 2, 4, 8, 16, 32, 64 and 128 simultaneous connections were used.
The results of these tests clearly show that our system exceeds the SLA specification for transform operations as defined in Specification 10.
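A comparison of measured round trip times against the SLA can be sketched as follows. The limits below reflect our reading of the Specification 10 service level matrix (4000 ms for session and transform commands, 2000 ms for query commands, each for at least 90% of commands) and should be verified against the Guidebook. The function name and input format are assumptions for illustration.

```python
# RTT limits in milliseconds, as we read Specification 10 (verify before use).
SLA_LIMIT_MS = {"session": 4000, "query": 2000, "transform": 4000}
REQUIRED_PERCENT = 90


def meets_sla(kind, rtts_ms):
    """True when at least 90% of the measured RTTs are within the limit."""
    limit = SLA_LIMIT_MS[kind]
    within = sum(1 for rtt in rtts_ms if rtt <= limit)
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return within * 100 >= REQUIRED_PERCENT * len(rtts_ms)


print(meets_sla("query", [120, 150, 180, 2500, 140, 130, 160, 170, 110, 190]))
```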
Server Load
During the time the test was running, we collected system performance data using the Linux command vmstat. Among other variables, this command reports on CPU usage (both user and system), which we used to observe how the system behaved under heavy load.
The performance data shows that the server maintained an average 75% load during EPP "create" and "info" operations, but that CPU usage later climbed higher when performing large numbers of EPP "delete" commands. This is easily remedied by separating functions so that they do not compete for CPU resources on the front-end.
See EXHIBIT: "24-Chart-EPP-Vmstat-graph.png".
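For readers who want to reproduce this kind of measurement, the sketch below extracts the user and system CPU columns from captured vmstat output. The sample text is made up for illustration and is not data from our test; the function names are likewise assumptions.

```python
SAMPLE_VMSTAT = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 810420  97380 1025808    0    0     5    10   50   80 60 15 25  0  0
 3  0      0 809112  97380 1025900    0    0     0    24  310  620 70 20 10  0  0
"""


def parse_vmstat(text):
    """Return one dict per data row, keyed by the vmstat column names."""
    columns = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "us" in fields and "sy" in fields:
            columns = fields  # header row; vmstat repeats it periodically
        elif columns and len(fields) == len(columns):
            if all(f.isdigit() for f in fields):
                rows.append(dict(zip(columns, map(int, fields))))
    return rows


def average_cpu_busy(rows):
    """Average of user plus system CPU percentage across all rows."""
    return sum(row["us"] + row["sy"] for row in rows) / len(rows)


print(average_cpu_busy(parse_vmstat(SAMPLE_VMSTAT)))
```

Reading the column names from the header row, rather than using fixed positions, keeps the parser tolerant of vmstat versions that add columns.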
24.1.4. Registrar Web Interface
The registrar web interface complements the functionality provided by the underlying EPP protocol with rich features intended to instill confidence in the registrar. It provides:
* Management of Internet Objects (hosts, domains and contacts)
* Administrative information such as:
** Balance checks
** Financial reporting
* Operational and statistical reporting:
** Registry services status
** DNS platform stability and performance
* Registrar messages and event notifications
* Registry support services
As with the EPP Interface, the system uses a multi-layer authentication system that includes:
* TLS Certificate to identify the client
* User/password combination to authenticate access to the system
* Devices accessing the system for the first time must be approved by an email challenge sent to the registrar operational contact
* Any changes to Internet Objects within the registry will also require an email challenge sent to the operational contact
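The layered checks listed above can be read as a simple decision sequence. The sketch below is illustrative only: the function names, arguments, and return values are hypothetical and do not describe the actual web interface code.

```python
def login_decision(cert_trusted, password_ok, device_id, approved_devices):
    """Return the next step for a login attempt (hypothetical names)."""
    if not cert_trusted:
        return "reject"           # TLS client certificate not recognized
    if not password_ok:
        return "reject"           # user/password combination failed
    if device_id not in approved_devices:
        return "email-challenge"  # first access from this device
    return "allow"


def change_decision(session_authenticated):
    """Every change to an Internet Object requires its own email challenge."""
    return "email-challenge" if session_authenticated else "reject"


print(login_decision(True, True, "laptop-7", {"laptop-1"}))
print(change_decision(True))
```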
Since this web interface uses a communication channel specifically designed for registrars, we can be innovative with regard to the services offered through it, and we can adapt to new challenges and technologies as they come along.
This service is intended to be used as a backup or in unusual circumstances by the registrars; we do not envision high load demands over it.
24.2. RESOURCING PLANS
Costs and procurement of the resources described here are detailed in response to Question 47.
24.2.1. Human Resources
The resourcing plan specific to this response follows the principles, guidelines and information set forth in our response to Question 23.
The accompanying chart shows the human resources allocated to the functions depicted in this response.
See EXHIBIT: "24-Chart-Resourcing.png" for human resource assignments.
24.3. ABOUT THIS RESPONSE
We believe this answer is complete and must be awarded the full score:
* We include a complete description of the SRS solution proposed by ISC: a high-level description of the system, a representative network diagram, a table with the initial number of servers in the solution, a description of the interconnections with other components, and a discussion of the synchronization scheme. Both time and database synchronization are explained early in the response, including the frequency of synchronization (instantaneous and continuous) and the synchronization scheme (live streaming replication).
* This solution provides high performance, high integrity, and high availability - it provides full service through foreseeable failures and includes sufficient resilience and backup to assure business continuity.
* The system architecture is readily expansible: it can obtain increased performance by adding new servers.
* This solution meets, and exceeds, the performance requirements of Specification 10.
* We fully support EPP, DNSSEC, IPv4, and IPv6.