24 Shared Registration System (SRS) Performance

Prototypical answer:

gTLD: .SHOP
Full Legal Name: GMO Registry, Inc.
E-mail suffix: gmoregistry.com

1. Overview
The Shared Registration System (SRS) is a critical function of a TLD that enables multiple registrars to provision and manage registry objects such as domains, name servers and contacts. The availability and performance of the SRS have an acute impact on registrar business operations, registrants and Internet users, and therefore a direct effect on the security and stability of the Internet as well as consumer confidence in the DNS. In recognition of that, GMO Registry has deployed a secure, robust and scalable SRS infrastructure to meet the anticipated demands of the .shop gTLD, with ample headroom to absorb surges and the capability to scale up easily.

2. Architecture

GMO Registry uses the Cloud Registry Management Platform (CRMP) for its SRS implementation. CRMP is a modern implementation of the SRS and supporting services for TLD registries. At a high level, the CRMP SRS consists of the following components, which together make up a system that is fully compliant with the gTLD Agreement requirements. Specifically, it conforms to, and is fully interoperable with, the standards in Specification 6. Additionally, the platform is designed to be efficient and horizontally scalable and, when appropriately resourced and operated (see the Infrastructure and Resourcing Plans sections), is fully capable of exceeding the Service Level Requirements (SLRs) stipulated in Specification 10.

![attached: SRS Software Component Architecture - 24_1.png](/24_1.png)

2.1. Provisioning interfaces

2.1.1. EPP Interface

The EPP implementation conforms to clause 1.2 in Specification 6 of the gTLD Agreement. In particular, it is fully compliant with RFCs 5730, 5731, 5732, 5733, 5910, and 3915. The EPP service is offered as a TCP server protected by TLS (RFC 5734). In addition, all supported extensions comply with RFC 3735 (guidelines for extending EPP).

The EPP server understands the EPP protocol, and maintains the authenticated EPP session for the lifetime of the connection. It acts as an intermediary, forwarding requests from registrars to the Application Server and relaying the responses back to the registrars. The EPP server uses standard EPP protocol communication with the registrars and an efficient remote procedure call mechanism with the Application Server.
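As an illustration of the wire format involved, RFC 5734 frames each EPP message with a 4-byte, network-order length header whose value counts the header itself. The following minimal Python sketch shows that framing; the helper names are illustrative, not part of the CRMP implementation:

```python
import struct

def frame_epp(xml_bytes: bytes) -> bytes:
    """Prepend the RFC 5734 EPP data unit header: a 4-byte
    network-order total length that includes the header itself."""
    return struct.pack(">I", len(xml_bytes) + 4) + xml_bytes

def unframe_epp(data: bytes) -> bytes:
    """Strip the 4-byte header and return the XML payload."""
    (total_length,) = struct.unpack(">I", data[:4])
    return data[4:total_length]
```

In production this framing is applied on top of a TLS-protected TCP stream, as required by RFC 5734.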

The EPP server only accepts connections from authorized IP addresses advised by registrars.
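A source-address check of this kind can be sketched with Python's standard `ipaddress` module; the networks below are documentation prefixes standing in for registrar-advised addresses:

```python
import ipaddress

# Stand-in allowlist; real entries are the IP addresses advised by registrars.
ALLOWED_NETWORKS = [ipaddress.ip_network(n)
                    for n in ("192.0.2.0/24", "2001:db8::/32")]

def is_authorized(source_ip: str) -> bool:
    """Accept a connection only if its source address falls within
    one of the registrar-advised networks."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)
```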

2.1.2. Web Management Interface

The web-based management interface contains a subset of the functionality offered in EPP, along with convenient features such as viewing reports and billing information. It is offered over a secure HTTP interface protected by SSL/TLS. It uses the same remote procedure call mechanism as the EPP interface for relaying user requests to the Application Server.

2.2. Internal components

2.2.1. Application Server

The application server implements core SRS business rules which manage the full lifecycle of various registry objects, enforces policies and mediates access to the core SRS database. As part of a fault tolerant, distributed system, it uses the Distributed Coordination Service (DCS) to synchronise access to resources. In order to enhance resiliency as well as to maintain service levels, it also uses the message queue for asynchronous processing of non-timing sensitive or potentially high latency operations.
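The asynchronous-processing pattern described above can be sketched in Python, with an in-process queue standing in for the message queue; the operation names are illustrative:

```python
import queue
import threading

# In-process stand-in for the message queue used for asynchronous
# processing of non-timing-sensitive operations.
work_queue: "queue.Queue" = queue.Queue()
processed = []

def worker() -> None:
    """Drain deferred operations off the command processing path."""
    while True:
        task = work_queue.get()
        if task is None:          # sentinel: shut the worker down
            break
        processed.append(task)    # e.g. a billing record or Whois update
        work_queue.task_done()

t = threading.Thread(target=worker)
t.start()
# The command path enqueues and returns immediately instead of blocking:
for op in ("billing", "dns-update", "whois-update"):
    work_queue.put(op)
work_queue.put(None)
t.join()
```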

2.2.2. SRS Database

The SRS database is a logical collection of registry objects and associated data, presented as an interface that allows clients to perform read and write operations to the collection. In concrete terms, it is a cluster of database processes that together provides a scalable, highly available, and fault-tolerant service at the core of the registry.

2.3. Interfaces to other registry systems

2.3.1. DNS Publication Subsystem

The DNS Publication Subsystem is responsible for publishing those updates to the SRS database that have a material effect on the DNS. These include domain name registration and deletion, updates that cause the set of delegated name servers to change, and glue record creation, update and deletion.

DNSSEC processing is also handled within the DNS Publication Subsystem, including functions such as key management and zone signing.

Changes effected in this subsystem are replicated to the public authoritative name servers using standard zone transfer mechanisms.

2.3.2. Whois Publication Subsystem

The Whois Publication Subsystem processes data destined for inclusion into the public Registration Data Directory Service, storing it in a highly optimized database for query-only workload consistent with the nature of the Whois service. The database is replicated into public Whois server points of presence over secure VPN tunnels.

2.3.3. Billing & Reporting Subsystem

The Billing and Reporting Subsystem handles transactions within the SRS and any other registry services that are deemed billable. It is responsible for billing functions including financial reconciliation, interfaces to accounting systems, invoicing and business report generation.

2.3.4. Data Export Subsystem

The Data Export Subsystem is a collection of processes that satisfy the following ICANN gTLD contractual requirements: Registry Data Escrow, Zone File Access, and Bulk Registration Data Access. It involves a set of escrow data and zone data generation processes as well as supporting services.

It also includes the ICANN Reports Generation Batch Process, a monthly batch job that collects and aggregates data, transaction logs, and metrics from various registry components and produces a set of reports according to Specification 3 of the gTLD Agreement. Generated reports are sent to ICANN via the prescribed communication channel.
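The aggregation step of such a batch job can be sketched as follows; the log entries and CSV layout are illustrative, not the actual Specification 3 format:

```python
import collections
import csv
import io

# Hypothetical transaction log entries: (registrar, command) pairs.
log = [("registrar-a", "domain-create"), ("registrar-a", "domain-renew"),
       ("registrar-b", "domain-create")]

# Aggregate transaction counts per registrar and command.
counts = collections.Counter(log)

# Emit a per-registrar activity CSV in the spirit of monthly reporting.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["registrar", "command", "count"])
for (registrar, command), n in sorted(counts.items()):
    writer.writerow([registrar, command, n])
```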

3. Infrastructure

3.1. Sites

In order to maintain high service levels in the face of catastrophic disasters or outages caused by equipment failure, loss of upstream connectivity or power failure, SRS functions are delivered utilizing fault-tolerant mechanisms.

SRS functions during normal operations are delivered from the primary site. In the event of an outage affecting the primary site, all services can be delivered from a geographically diverse, hot standby site.

All systems at the primary and standby sites are deployed in a fully redundant N+1 configuration and transparently withstand the catastrophic failure of any single component, utilizing load balancers with active health check mechanisms, clustered fault-tolerant application servers, and transit capacity from multiple independent upstream providers.

Both the primary and standby sites are hosted in telco-grade data centres. More details about the data centres are specified in the answers to Question 32 “Architecture”.

All critical registry data is continuously replicated to the backup site via securely encrypted transmission channels, providing a near real-time recovery point objective.

![attached: 24_2.png](/24_2.png)

3.1.1. SRS Primary Site

More details on the infrastructure hardware are specified in the answers to Question 32 “Architecture”.

Location: Tokyo
Equipment manifest:

2 x Firewalls
2 x Switches
1 x Console Server
9 x Intel Quad Xeon-based Servers
3 x Intel Quad Xeon-based Database Servers
2 x VM Host Servers
1 x HSM
1 x Backup Server

3.1.2. SRS Hot Standby Site

Location: Sydney
Equipment manifest:

2 x Firewalls
2 x Switches
1 x Console Server
3 x Intel Xeon-based Database Servers
6 x VM Host Servers
1 x HSM
1 x Backup Server

3.2. Synchronisation Scheme

The primary and standby sites are connected via a secure VPN tunnel over redundant links. All synchronisation traffic is securely transmitted over the VPN tunnel. Connectivity between sites, as well as synchronisation status of various sub-system components are actively monitored.

3.2.1. SRS Database

The SRS database is held in a master database cluster in the primary site, with a slave cluster mirrored in the standby site. The clusters operate in an active-standby mode, with all data natively replicated from the master to the standby cluster in an asynchronous fashion.

3.2.2. Whois Database

The Whois master database contains a subset of data stored in the SRS database, and if required, can be readily regenerated from the SRS database. There are two instances of the Whois master database - one in the primary site (primary Whois master) and another on the standby site (standby Whois master).

Synchronisation between the primary Whois master database and the standby Whois master, as well as the public Whois nodes, is achieved using master-slave asynchronous replication.

3.2.3. Zone Data

TLD zone data (both clear and signed copies) is synchronised using standard DNS master-slave replication between the BIND instances at the primary site and mirror instances at the standby site.

3.2.4. Applications and Configurations

Software deployments and configuration changes are kept in sync on both sites at the time of deployment or change implementation. This is enforced through strict operational procedures, with monitoring in place to detect and alert on any inconsistencies.

3.3. Network

Within each site, the SRS network architecture is depicted as follows:

![attached: SRS Conceptual Architecture - Provisioning - 24_3.png](/24_3.png)

![attached: SRS Conceptual Architecture - Publication - 24_4.png](/24_4.png)

4. SRS Performance

GMO Registry will comply with all SRS performance specifications set forth in Specification 10 of the gTLD Agreement.

In order to meet the availability and response time service levels defined in Specification 10, it is necessary to take into account the various components that constitute the SRS as a whole.

4.1. Availability

The necessary preconditions for maintaining availability of the SRS are as follows:
a. the EPP interface and all of the supporting components must be end-to-end operational; and
b. the round trip time of any given EPP command must not be more than 5 times the corresponding SLR defined in Specification 10.
While the actual requirement for EPP service availability in Specification 10 is more lenient than what is stated here, response times that exceed normal levels are almost certainly an indication of a fault or capacity issue in an underlying component that has the potential of compounding into a loss of availability. As such, instances of EPP commands that exhibit such behaviour are investigated and dealt with urgently.
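The threshold in precondition (b) can be expressed as a simple check against the Specification 10 SLR values; the function name is illustrative:

```python
# Specification 10 SLRs in milliseconds, per EPP command class.
SLR_MS = {"query": 2000, "session": 4000, "transform": 4000}

def needs_urgent_investigation(command_type: str, rtt_ms: float) -> bool:
    """Flag any EPP command whose round trip time exceeds 5x its SLR."""
    return rtt_ms > 5 * SLR_MS[command_type]
```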
In general, the following strategies are employed to safeguard the availability of the SRS:
internal redundancy - ensures that a single component failure in any layer of the architecture will not result in an outage
network redundancy - multiple upstream transit providers mitigate the risk of connectivity issues
site redundancy - a geographically distinct hot standby site mitigates prolonged SRS outages in case of catastrophic failure involving an entire data centre or geographic region
software architecture - clear separation of concerns between decoupled components, each with different load characteristics, facilitates horizontal scaling to upgrade the capacity of affected components
hardware scalability - computing resources are virtualised with headroom in the host systems, allowing the operational flexibility to add capacity as needed
monitoring - collecting performance and system metrics and alerting on abnormal resource usage levels helps to proactively detect and identify issues so that prompt corrective action can be taken, and facilitates capacity planning
capacity planning - a well-defined plan to periodically review system performance and plan upgrades well before performance is affected.

4.2. EPP Command Response Times

In general, EPP command response time is the sum of the processing times required by each component responsible for processing the request, as well as the internal network round trip time required. The following tabulates the components and their respective overhead in our SRS implementation:

EPP Servers: decoding of SSL frames, session management, EPP message processing
Application Servers: command processing logic, including validation and business rules
Database Servers: read and write transactions involving disk I/O, intra-cluster communication for managing consistency

GMO Registry optimizes EPP command response times by employing the following strategies:
deferring operations that can be performed in the background, removing them from the critical command processing path; examples include billing, DNS updates and Whois updates
employing efficient coding techniques such as safe caching of static data, parallelizing operations where possible, and lightweight compression to minimize expensive network round trips (relative to the CPU time required for compression/decompression)
careful tuning of components to cater to their load characteristics
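The "safe caching of static data" technique above can be illustrated with Python's `functools.lru_cache`; the policy loader and its contents are hypothetical:

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def load_tld_policy(tld: str) -> dict:
    """Hypothetical loader for static per-TLD policy data; safe to
    cache because the data changes only at deployment time."""
    global calls
    calls += 1
    return {"tld": tld, "min_label_length": 1, "max_label_length": 63}

load_tld_policy("shop")
load_tld_policy("shop")   # served from the cache; the loader runs once
```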

4.2.1. Service Level Parameters

The following service level parameters are defined in Specification 10. GMO Registry is aware that, for the purposes of determining service levels, the following round-trip time (RTT) measurements are timed from the perspective of the testing probes and include the network latency of the link between the probes and the SRS infrastructure. While this effectively increases the rigor of the requirements, GMO Registry is committed to exceeding the SLRs, and uses significantly more stringent thresholds for monitoring and internal targets.

EPP Query Command RTT

Specification 10 defines this to be the “RTT of the sequence of packets that includes the sending of a query command plus the reception of the EPP response for only one EPP query command.” The Service Level Requirement for this parameter is “≤ 2000 ms, for at least 90% of the commands”, as seen from the testing probes.
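The "≤ 2000 ms, for at least 90% of the commands" criterion amounts to a percentile check over observed RTTs, which can be sketched as follows (illustrative only; the authoritative measurements are taken by the external testing probes):

```python
def meets_slr(rtt_samples_ms, slr_ms=2000, quantile=0.90):
    """Return True when at least `quantile` of the observed RTT
    samples fall within the SLR threshold."""
    within = sum(1 for rtt in rtt_samples_ms if rtt <= slr_ms)
    return within / len(rtt_samples_ms) >= quantile
```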

GMO Registry’s SRS implementation is highly optimized for performance. Query command processing is efficient and requires minimal resources. In addition, query command throughput scales linearly as capacity is added to increase concurrency.

Analysis of production and load-test logs confirms that GMO Registry’s SRS infrastructure has the capability to handle query commands well within 100ms at 160 transactions per second.

EPP Session Command RTT

Specification 10 defines this to be the “RTT of the sequence of packets that includes the sending of a session command plus the reception of the EPP response for only one EPP session command. For the login command it will include packets needed for starting the TCP session. For the logout command it will include packets needed for closing the TCP session.” The Service Level Requirement for this parameter is “≤ 4000 ms, for at least 90% of the commands”, as seen from the testing probes.

From a processing perspective, session commands have a load profile similar to that of EPP query commands, with the added overhead of TCP and SSL establishment and teardown. These involve handshake packets, which may amplify any latency issues between the testing probes and the SRS infrastructure.

The SRS has the capability to handle session commands well within 100ms at 210 transactions per second.

EPP Transform Command RTT

Specification 10 defines this to be the “RTT of the sequence of packets that includes the sending of a transform command plus the reception of the EPP response for only one EPP transform command.” The Service Level Requirement for this parameter is “≤ 4000 ms, for at least 90% of the commands”, as seen from the testing probes.

GMO Registry’s SRS implementation is highly optimized for performance. Transform commands are by nature more complex than queries, and involve database write operations with transaction management within the database cluster. For performance and fault tolerance, GMO Registry utilizes workers to perform operations in the background. This removes potentially expensive operations from the processing pipeline, improving performance and reliability.

In addition, the GMO Registry database seamlessly scales write operations linearly with cluster size, so that capacity can be added on demand. Please refer to the answers to Question 33 “Database” for more information.

The SRS has the capability to handle transform commands well within 500ms at 150 transactions per second.

4.3. Capacity

GMO Registry anticipates that .shop will utilize less than 35% of the infrastructure capacity reserved for .shop during the first 3 years of operation. Individual systems are upgraded when utilization reaches 60%.
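The 60% upgrade trigger amounts to a simple utilization check, sketched below with an illustrative function name:

```python
UPGRADE_THRESHOLD = 0.60  # individual systems are upgraded at 60% utilization

def upgrade_needed(used_capacity: float, reserved_capacity: float) -> bool:
    """Return True when a system's utilization reaches the upgrade threshold."""
    return used_capacity / reserved_capacity >= UPGRADE_THRESHOLD
```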

GMO Registry conducts frequent utilization assessments to determine the need for infrastructure upgrades.

Please refer to the “Scale, Projection and Costs” section of the answers to Question 31 “Registry Overview” for more details.

5. Resourcing Plans

The implementation and operation of this aspect of registry operations fall under the areas of responsibility of the following roles:
Technical Manager
Network Engineer
Applications Engineer
Database Administrator
System Architect
System Administrator
Technical Support
Security Officer
QA and Process Manager

The attached table, “resource_fte.png”, outlines the overall FTE equivalent resources available to GMO Registry for the initial implementation and ongoing operations of the registry, of which SRS Performance is a subset.

5.1. Initial Implementation

Initial implementation of SRS Performance refers to:
implementation and deployment of the infrastructure and systems outlined in this document, a significant portion of which already exists in the current GMO Registry production infrastructure;
implementation of performance and availability monitoring, in support of the goals outlined in this document, that is not already part of the current production monitoring setup.

During this phase, all roles listed above are involved in the planning and implementation of their respective systems in support of this component.

5.2. Ongoing Maintenance

The ongoing maintenance of SRS Performance involves:
monitoring of performance and availability objectives
analyzing systems behavior and optimizing application, configuration and architecture to yield improved performance
conducting proactive maintenance according to the standard operational procedures

All roles listed above are involved in this phase of the operations.

Similar gTLD applications: (26)

| gTLD | Full Legal Name | E-mail suffix | Score |
|---|---|---|---|
| .SHOP | GMO Registry, Inc. | gmoregistry.com | -2.56 |
| .MAIL | GMO Registry, Inc. | gmoregistry.com | -2.54 |
| .hitachi | Hitachi, Ltd. | hitachi.com | -2.54 |
| .datsun | NISSAN MOTOR CO., LTD. | thomsonbrandy.jp | -2.54 |
| .sharp | Sharp Corporation | gmoregistry.com | -2.54 |
| .yokohama | GMO Registry, Inc. | gmoregistry.com | -2.54 |
| .infiniti | NISSAN MOTOR CO., LTD. | thomsonbrandy.jp | -2.54 |
| .nissan | NISSAN MOTOR CO., LTD. | thomsonbrandy.jp | -2.54 |
| .suzuki | SUZUKI MOTOR CORPORATION | gmoregistry.com | -2.54 |
| .nagoya | GMO Registry, Inc. | gmoregistry.com | -2.54 |
| .mtpc | Mitsubishi Tanabe Pharma Corporation | gmoregistry.com | -2.54 |
| .DNP | Dai Nippon Printing Co., Ltd. | mail.dnp.co.jp | -2.54 |
| .GGEE | GMO Internet, Inc. | gmoregistry.com | -2.54 |
| .osaka | GMO Registry, Inc. | gmoregistry.com | -2.54 |
| .canon | Canon Inc. | web.canon.co.jp | -2.54 |
| .GMO | GMO Internet, Inc. | gmoregistry.com | -2.54 |
| .konami | KONAMI CORPORATION | konami.com | -2.54 |
| .otsuka | Otsuka Holdings Co., Ltd. | otsuka.jp | -2.54 |
| .INC | GMO Registry, Inc. | gmoregistry.com | -2.54 |
| .nhk | Japan Broadcasting Corporation (NHK) | internet.nhk.or.jp | -2.54 |
| .toshiba | TOSHIBA Corporation | gmoregistry.com | -2.54 |
| .TOKYO | GMO Registry, Inc. | gmoregistry.com | -2.54 |
| .kddi | KDDI CORPORATION | gmoregistry.com | -2.54 |
| .gree | GREE, Inc. | gmoregistry.com | -2.54 |
| .okinawa | BusinessRalliart inc. | gmoregistry.com | -2.51 |
| .ryukyu | BusinessRalliart inc. | gmoregistry.com | -2.51 |