Client API Reference, v2.2.5

dse.proto

A low-level data model and APIs for describing a Keysight AI/ML experiment.

AbortTrialRequest

Request to abort a trial

Currently, this attempts to abort only the currently running configuration, which is already stored in DSE.

AbortTrialResponse

Contains information about a trial abort

Field Type Label Description
state AbortState

The state of the abort request.

message string optional

A message containing relevant information for the abort.

CollectiveImplementations

Wrapper message for gRPC response.

Field Type Label Description
collective_implementations common.CollectiveImplementation repeated

List of collective implementations

CreateBindingRequest

Request to create rank and physical bindings for use with a Trial.

Field Type Label Description
infrastructure_profile profiles.InfraProfile

The infrastructure profile to use for the binding.

prev_binding bind.Binding optional

Previous binding (if any); useful for incremental changes to an existing binding.

platform_regions bind.PlatformRegion repeated

Assigns platforms to different regions of the infrastructure. The DSE server raises an error if this is non-empty and the onearm feature flag is not in use.

platform common.PlatformType

The platform type.

CreateBindingResponse

Response containing the created binding.

Field Type Label Description
binding bind.Binding

The created binding object

GetDiagnosticFileRequest

Request to get diagnostic files for specified trials.

Field Type Label Description
result_ids string repeated

List of result IDs to include in the diagnostic file. Result IDs can be obtained from the trial_report provided by the ListTrialReports API call. If result_ids is empty, diagnostics are collected for the currently configured trial.

GetDiagnosticFileResponse

Response containing the location of the diagnostic file.

Field Type Label Description
filepath string

File path to the archived log files (zip, tar, etc.)

url string

URL to download the diagnostic file

RunLogs

Message in a stream of updates returned while running a trial.

The client can print the log messages returned by each update to have a live indication of progress.

Field Type Label Description
log_messages string

The text of the log message.

timestamp google.protobuf.Timestamp

The timestamp signifies the time at which the log message was generated.

severity_level SeverityLevel

The severity level of the log message.

component_name string

The name of the component emitting the log message.

Trial

Trial is a message that contains all the required messages used to define a trial run.

Field Type Label Description
workspace WorkspaceSpec

The workspace that the trial will be stored under. If the workspace does not exist, it will be created.

tags string repeated

A list of tags associated with this trial.

platform common.PlatformType

The type of platform to run the workload over

nccl_config common.NcclConfig

Configuration settings specific to NCCL

tcp common.TcpTransport

TCP transport configuration

rocev2 common.Rocev2Transport

RoCEv2 transport configuration

falcon common.FalconTransport

Falcon transport configuration

kccb kccb.Config

KCCB configuration

workload_replay workload_replay.Config

Workload replay configuration

binding bind.Binding optional

The binding to use for the trial; typically obtained from CreateBinding()

trial_meta TrialMeta

Holds version information

impairments impairment.Impairments

List of impairment metrics

TrialMeta

Holds the model version; more fields may be added in the future.

Field Type Label Description
model_version string

The model version of the trial configuration

is_readonly bool

Set if the config cannot be used to run a trial, e.g. because part of the config that is not exposed by the UI was removed before saving. The configuration can still be inspected in the UI, but a trial will fail to run.

TrialReport

Contains information about a trial that has run, is running, or has yet to be started.

Field Type Label Description
timestamp google.protobuf.Timestamp

The timestamp signifies the time at which a trial run was started. In storage, each trial directory is named according to the following format: ISO 8601 formatted timestamp YYYY-MM-DDTHH:mm:SS.ms:timezone

workspace string

workspace name

path string

The storage path of the trial directory.

tags string repeated

Tags associated with the trial.

description string

Description of the trial

state TrialState

Stores the current state of this trial.

system_tags string repeated

System tags associated with the trial.

end_timestamp google.protobuf.Timestamp

The timestamp signifies the time at which a trial run was completed.

kccb_summary app_common.SummaryTable

A table of NCCL-like summary results

workload_replay_summary app_common.SummaryTable

A table of workload replay summary results

kccb_artifacts app_common.TrialArtifacts

A collection of artifacts generated by the trial that are saved in storage.

workload_replay_artifacts app_common.TrialArtifacts

A collection of artifacts generated by the trial that are saved in storage.

use_report_v2 bool optional

A flag to indicate whether to use the v2 report format

info_report_v2 storage_v2.TrialReportDetailInfo optional

Detailed report information in v2 format

WorkspaceSpec

Describes a workspace to be created/updated.

If the workspace already exists, the list of tags will be appended to that workspace if they are not already attached to that workspace.

Field Type Label Description
name string

Workspace name

tags string repeated

List of tags to allow filtering trial results
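
The tag-append behavior described for an existing workspace can be pictured as an order-preserving merge. The sketch below is only an illustration: `merge_tags` is a hypothetical helper, not part of the API, and the ordering of the merged list is an assumption that the reference does not specify.

```python
def merge_tags(existing, new):
    """Append tags from `new` that are not already in `existing`.

    Hypothetical helper illustrating the WorkspaceSpec tag semantics.
    Preserving the original order is an assumption of this sketch.
    """
    merged = list(existing)
    seen = set(existing)
    for tag in new:
        if tag not in seen:
            merged.append(tag)
            seen.add(tag)
    return merged


if __name__ == "__main__":
    print(merge_tags(["baseline", "roce"], ["roce", "nightly"]))
```

Tags that are already attached are skipped, so sending the same WorkspaceSpec twice would leave the tag list unchanged under this model.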

AbortState

State of an abort request.

Name Number Description
ABORT_UNDEFINED 0

Undefined state

ABORT_INITIATED 1

The abort request has been initiated

ABORT_ERROR 2

An error occurred during the abort process

SeverityLevel

Contains information about the severity level of a log message.

Name Number Description
LEVEL_UNSPECIFIED 0

Unspecified severity level

LEVEL_DEBUG 1

Debugging information

LEVEL_INFO 2

Informational messages

LEVEL_WARNING 3

Warning messages

LEVEL_ERROR 4

Error messages

LEVEL_CRITICAL 5

Critical error messages
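
Because the SeverityLevel numbers increase with importance, a client printing streamed RunLogs messages could filter them with a simple numeric comparison. The following is a minimal sketch: the `SeverityLevel` class is a local mirror of the values listed above rather than the generated protobuf enum, `should_print` is a hypothetical helper, and always showing LEVEL_UNSPECIFIED is an assumption.

```python
from enum import IntEnum


class SeverityLevel(IntEnum):
    """Local mirror of the SeverityLevel values listed above."""
    LEVEL_UNSPECIFIED = 0
    LEVEL_DEBUG = 1
    LEVEL_INFO = 2
    LEVEL_WARNING = 3
    LEVEL_ERROR = 4
    LEVEL_CRITICAL = 5


def should_print(level, minimum=SeverityLevel.LEVEL_WARNING):
    """Return True if a message at `level` meets the `minimum` threshold.

    Assumption: messages with LEVEL_UNSPECIFIED are always shown, so that
    nothing is hidden when a component does not set a severity.
    """
    if level == SeverityLevel.LEVEL_UNSPECIFIED:
        return True
    return level >= minimum


if __name__ == "__main__":
    for lvl in SeverityLevel:
        print(lvl.name, should_print(lvl))
```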

TrialState

State of a trial run.

Name Number Description
UNSPECIFIED 0

Trial state not specified, invalid value

UNCONFIGURED 1

Trial is not configured

CONFIGURATION_IN_PROGRESS 2

Trial configuration is in progress

CONFIGURATION_SUCCESSFUL 3

Trial configuration completed successfully

RUN_IN_PROGRESS 4

Trial run is in progress

RUN_SUCCESSFUL 5

Trial run completed successfully

ERROR 6

Trial has encountered an error

ABORTED 7

Trial has been aborted

TERMINATED 8

Trial has been terminated

ABORT_IN_PROGRESS 9

Trial abort is in progress
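
A client that polls a trial's state typically needs to know when to stop polling. The sketch below mirrors the TrialState numbers from the table above in a local enum and groups the states that, as an assumption of this example, indicate that the run will not progress any further. The reference does not define which states are final, so treat `FINISHED_STATES` and `is_finished` as hypothetical.

```python
from enum import IntEnum


class TrialState(IntEnum):
    """Local mirror of the TrialState values listed above."""
    UNSPECIFIED = 0
    UNCONFIGURED = 1
    CONFIGURATION_IN_PROGRESS = 2
    CONFIGURATION_SUCCESSFUL = 3
    RUN_IN_PROGRESS = 4
    RUN_SUCCESSFUL = 5
    ERROR = 6
    ABORTED = 7
    TERMINATED = 8
    ABORT_IN_PROGRESS = 9


# Assumption: a trial in one of these states will not change state again
# without a new ConfigureTrial or RunTrial call.
FINISHED_STATES = frozenset({
    TrialState.RUN_SUCCESSFUL,
    TrialState.ERROR,
    TrialState.ABORTED,
    TrialState.TERMINATED,
})


def is_finished(state):
    """Return True if `state` (enum member or raw number) is a final state."""
    return TrialState(state) in FINISHED_STATES


if __name__ == "__main__":
    print(is_finished(TrialState.RUN_IN_PROGRESS), is_finished(5))
```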

DseService

DSE service definition

Method Name Request Type Response Type Description
CreateBinding CreateBindingRequest CreateBindingResponse

Creates rank and physical bindings for use with a Trial object in ConfigureTrial/RunTrial

ConfigureTrial Trial RunLogs stream

ConfigureTrial sets up the trial based on the parameters provided. If the low-level config is not included, the server will use the high-level spec to generate the corresponding low-level config, which is returned as part of the response.

AbortTrial AbortTrialRequest AbortTrialResponse

Aborts the currently running trial and returns the state of the abort request. Returns an error if no Trial has been configured.

RunTrial .google.protobuf.Empty RunLogs stream

Run the currently configured trial and receive streaming updates. Returns an error if no Trial has been configured

GetTrial .google.protobuf.Empty Trial

Returns the trial that is currently configured. If no trial is currently configured, the returned object will be uninitialized.

GetTrialReport .google.protobuf.Empty TrialReport

Returns the TrialReport, which contains state information (not started, in progress, successful, error). The pattern is modeled after https://cloud.google.com/apis/design/design_patterns#long_running_operations although it does not match it completely.

GetTrialReportDetails .google.protobuf.Empty .storage_v2.TrialReportDetailInfo

Returns the trial report of the most recently run trial. If no trial has been run, returns an empty message.

GetCollectiveImplementations .google.protobuf.Empty CollectiveImplementations

Returns a list of all (CC operation, algorithmic implementation) pairs available on the DSE server.

GetDiagnosticFile GetDiagnosticFileRequest GetDiagnosticFileResponse

Request to get diagnostic files for specified trials.

Echo .common.EchoRequest .common.EchoResponse

Test API that echoes a simple message

common.proto

Common data models

Algorithm

Algorithm message specifies a choice of system provided Expanders or a user provided custom implementation

Field Type Label Description
system AlgorithmType

A system supplied collective algorithm

custom string

A path of the format package.module.classname to a class that inherits from keys_ai_ml_chakra.Expander. See EXPANDERS.md in the keys_ai_ml_chakra package for details on how to create your own custom expander class.

flow_control_config FlowControl

Configuration for flow control mechanism used by this algorithm.

ChassisInfo

Information about the chassis being used for emulation

Field Type Label Description
address string

Chassis IP Address or FQDN

port string

Chassis port. Formats <front-panel-port> or <front-panel-port>.<fanout>. Examples: '1' or '1.4'
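
The two port formats can be told apart by the optional fanout suffix. As a minimal sketch, the hypothetical helper below splits a port string into its parts. It assumes both parts are decimal integers, as in the examples '1' and '1.4'; the reference does not state the allowed ranges.

```python
def parse_chassis_port(port):
    """Split a ChassisInfo port string into (front_panel_port, fanout).

    `fanout` is None when the string has no ".<fanout>" suffix.
    Assumption: both parts are decimal integers, as in '1' and '1.4'.
    """
    front_panel, sep, fanout = port.partition(".")
    if not front_panel.isdigit() or (sep and not fanout.isdigit()):
        raise ValueError(f"unexpected chassis port format: {port!r}")
    return int(front_panel), (int(fanout) if sep else None)


if __name__ == "__main__":
    print(parse_chassis_port("1"))
    print(parse_chassis_port("1.4"))
```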

CollectiveImplementation

Used to specify a (collective type, collective algorithm) pair.

The collective algorithm determines how a collective communication operation is expanded into a set of peer-to-peer operations.

Field Type Label Description
type keysight_chakra.mlcommons.CollectiveCommType

The type of collective communication operation.

algorithm Algorithm

A system-provided or user-provided algorithm specification used to expand a collective communication operation into a set of peer-to-peer operations.

CongestionControl

Congestion control mechanisms

Field Type Label Description
ecn ExplicitCongestionNotifications

ECN configuration

pfc PriorityFlowControl

PFC configuration

dcqcn_rate_control DCQCNRateControl

DCQCN rate control configuration

DCQCNRateControl

Data Center Quantized Congestion Notification rate control settings

Field Type Label Description
enabled bool optional

Enable/disable DCQCN

alpha_factor int32 optional

Factor to update Alpha every update period (fixed point fraction of 2^10)

alpha_interval int32

Alpha update period (microseconds)

initial_alpha int32

Initial Alpha value

rate_after_first_cnp int32

Current and target rate limit set after first CNP (Mbps)

rate_decrement_factor float optional

Maximal ratio of rate decrease in a single event (percentage)

min_rate_limit int32

Minimal rate limit of the QP (Mbps)

rate_decrement_coefficient int32

The coefficient between Alpha and the rate reduction factor

rate_decrement_interval int32

The minimum time period between rate reductions (microseconds)

clamp_target_rate bool optional

If enabled, whenever a CNP is processed, the target rate will be updated to the current rate

rate_increment_interval int32

The time period between rate increase events (microseconds)

rate_increment_byte_counter int32

The sent bytes counter between rate increase events (64B)

rate_increment_threshold int32

The threshold of rate increase events for moving to next rate increase phase

additive_rate_increment int32

The rate increase value in the Additive Increase phase (Mbps)

hyper_rate_increment int32

The rate increase value in the Hyper Increase phase (Mbps)

time_between_cnps int32

Minimal time between two consecutive CNPs sent (microseconds)
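
The alpha_factor field above is described as a fixed-point fraction of 2^10. One plausible reading, used in this sketch, is that the integer stored in the field equals the real-valued factor multiplied by 1024. The helper names and the assumed valid range of [0, 1] are hypothetical and should be checked against the DSE server's behavior.

```python
ALPHA_SCALE = 1 << 10  # 2^10, taken from the alpha_factor description


def alpha_to_fraction(raw):
    """Convert a raw alpha_factor integer to a real-valued fraction."""
    return raw / ALPHA_SCALE


def fraction_to_alpha(fraction):
    """Convert a real-valued fraction to the nearest raw integer.

    Assumption: valid fractions lie in [0, 1].
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be within [0, 1]")
    return round(fraction * ALPHA_SCALE)


if __name__ == "__main__":
    print(alpha_to_fraction(512), fraction_to_alpha(0.5))
```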

EchoRequest

A request to echo a simple message

Field Type Label Description
message string

The message to echo

EchoResponse

The echo response from a service

Field Type Label Description
message string

The echoed message

ExplicitCongestionNotifications

ECN configuration

Field Type Label Description
cnp_dscp int32 optional

DSCP of CNP packets

data_ecn_bits EcnBits optional

Configures the ECN bits for data packets

control_ecn_bits EcnBits optional

Configures the ECN bits for control packets, e.g. RoCEv2 ACKs

cnp_ecn_bits EcnBits optional

Configures the ECN bits for CNP packets

FalconTransport

Falcon transport configuration parameters

Field Type Label Description
rdma_message_size int32

(Maximum) RDMA message size in bytes

qps_per_rankpair int32

Number of Queue Pairs per rank pair

qp_negotiation RoCEv2QPNegotiationMethod

Queue Pair Negotiation method

verb RDMAVerb

RDMA verb to use for data transfers in collective communication operations

tcp_store_host string

TCPStore hostname or IP address (only applicable when qp_negotiation is METHOD_TCP_STORE)

tcp_store_port uint32

TCPStore port number (only applicable when qp_negotiation is METHOD_TCP_STORE)

FlowControl

Used to specify details needed to enable flow control feature

Field Type Label Description
enable_flow_control bool

Enable flow control mechanism where receiver grants credits to sender to control the data flow.

max_inflight_credits uint32

Maximum number of credits allowed in flight.

compute_delay uint32

Delay between data reception and credit transmission (microseconds)

credit_distribution FlowControlDistributionType

Determines how credits are distributed across streams. One credit can either unblock one stream at a time or all streams simultaneously

Ipv4Addressing

IPv4 Addressing configuration

Field Type Label Description
ip_address string

IP address

ip_prefix uint32

IP prefix

ip_gateway_address string

Gateway IP address

Ipv6Addressing

IPv6 Addressing configuration

Field Type Label Description
ip_address string

IP address

ip_prefix uint32

IP prefix

ip_gateway_address string

Gateway IP address
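
Both addressing messages carry an address, a prefix length, and a gateway. Before sending them, a client may want to check that the three values are mutually consistent. The sketch below uses Python's standard `ipaddress` module for this. It is a client-side sanity check only: `gateway_in_subnet` is a hypothetical helper, and the reference does not say whether the DSE server requires the gateway to lie in the same subnet.

```python
import ipaddress


def gateway_in_subnet(ip_address, ip_prefix, ip_gateway_address):
    """Return True if the gateway lies in the subnet of ip_address/ip_prefix.

    Accepts IPv4 or IPv6 strings; raises ValueError on malformed input.
    """
    interface = ipaddress.ip_interface(f"{ip_address}/{ip_prefix}")
    gateway = ipaddress.ip_address(ip_gateway_address)
    # Addresses of different IP versions can never share a subnet.
    if gateway.version != interface.version:
        return False
    return gateway in interface.network


if __name__ == "__main__":
    print(gateway_in_subnet("10.0.0.5", 24, "10.0.0.1"))
    print(gateway_in_subnet("2001:db8::5", 64, "2001:db8::1"))
```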

IxperfConfig

Ixperf configuration parameters

Field Type Label Description
custom_env_vars IxperfConfig.CustomEnvVarsEntry repeated

Custom Ixperf environment variables

IxperfConfig.CustomEnvVarsEntry

Field Type Label Description
key string

value string

Layer1

Layer 1 configuration

Field Type Label Description
speed_mode SpeedMode

Speed, modulation and FEC mode

auto_negotiate bool

Enable/disable auto negotiation

link_training bool

Enable/disable link training

ieee_defaults bool

Enable/disable IEEE Defaults. This setting takes precedence over auto-negotiation and link training

NcclConfig

NCCL configuration parameters

Field Type Label Description
custom_env_vars NcclConfig.CustomEnvVarsEntry repeated

Custom NCCL environment variables

NcclConfig.CustomEnvVarsEntry

Field Type Label Description
key string

value string

NicSettings

Network Interface Card settings

Field Type Label Description
ethernet_mtu int32

Ethernet Maximum transmission unit

ipv4_addressing Ipv4Addressing

IPv4 Addressing specifics

ipv6_addressing Ipv6Addressing

IPv6 Addressing specifics

qos Qos optional

Quality of service

congestion_control CongestionControl optional

Congestion control

packet_capture PacketCapture optional

Packet capture

mac_address string optional

MAC address

vlan Vlan optional

VLAN Tags

roce_transport_settings Rocev2TransportSettings

RoCEv2 transport-specific settings

PacketCapture

Packet capture configuration

Field Type Label Description
enabled bool

Enable/disable packet capture

capture_max_file_size uint32 optional

Maximum size of the capture file in bytes.

buffer_full_action BufferFullAction optional

What to do when the capture buffer is full

packet_slice_size uint32 optional

Size of each packet slice in bytes. If set to 0, the entire packet is captured

PriorityFlowControl

PFC configuration

Field Type Label Description
enabled bool optional

Enable/disable Priority Flow Control

priorities int32 repeated

List of priorities

Qos

Quality of Service configuration

Field Type Label Description
priority_trust_mode PrioTrustMode

Priority trust mode

map_dscp_to_prio Qos.MapDscpToPrioEntry repeated

DSCP to priority mapping

map_prio_to_traffic_class Qos.MapPrioToTrafficClassEntry repeated

Priority to traffic class mapping

Qos.MapDscpToPrioEntry

Field Type Label Description
key int32

value int32

Qos.MapPrioToTrafficClassEntry

Field Type Label Description
key int32

value int32

Rocev2Transport

RoCEv2 Transport configuration parameters

Field Type Label Description
rdma_message_size uint32

(Maximum) RDMA message size in bytes

qps_per_rankpair int32

Number of Queue Pairs per rank pair

qp_negotiation RoCEv2QPNegotiationMethod

Queue Pair Negotiation method

verb RDMAVerb

RDMA verb to use for data transfers in collective communication operations

tcp_store_host string

TCPStore hostname or IP address (only applicable when qp_negotiation is METHOD_TCP_STORE)

tcp_store_port uint32

TCPStore port number (only applicable when qp_negotiation is METHOD_TCP_STORE)

retx_retry_interval_ms int32 optional

The AckTimeout for RoCEv2 protocol

retx_retry_count int32 optional

The RetransRetryCount for RoCEv2 protocol

max_retry_on_rnr_nak int32 optional

Maximum number of retransmission attempts triggered when an RNR NAK is received

ack_request_interval int32 optional

Request ACK after every N packets

reuse_qps bool optional

Enable/disable reuse of Queue Pairs (QPs) for RDMA messages between a rank pair

support_rx_reordering bool optional

Enable/disable Rx reordering

Rocev2TransportSettings

RoCEv2 Transport specific settings

Field Type Label Description
ack_dscp int32 optional

Configures ACK DSCP value for RoCEv2 protocol

nack_dscp int32 optional

Configures NACK DSCP value for RoCEv2 protocol

data_dscp int32 optional

Configures DSCP for all data traffic

ServerInfo

Information about servers being used for emulation

Field Type Label Description
address string

Server IP address or FQDN

nic_interface string

Server NIC interface

TcpTransport

TCP Transport configuration parameters

Vlan

VLAN Configuration

Field Type Label Description
enabled bool

Enable/Disable VLAN

vlan_tags VlanTag repeated

List of VLAN tags. Currently one VLAN tag is required.

VlanTag

VLAN tag configuration

Field Type Label Description
priority int32

VLAN Priority

vlan_id int32

VLAN ID
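
On the wire, an IEEE 802.1Q tag packs the priority and VLAN ID into a 16-bit Tag Control Information (TCI) field: 3 bits of priority, 1 drop-eligible bit, and 12 bits of VLAN ID. The sketch below shows that packing and the value ranges it implies. The reference does not state which ranges the DSE server accepts, so the checks reflect the 802.1Q field widths only, and `vlan_tci` is a hypothetical helper.

```python
def vlan_tci(priority, vlan_id, dei=0):
    """Pack priority (PCP), DEI, and VLAN ID into a 16-bit 802.1Q TCI."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits (0-7)")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("vlan_id must fit in 12 bits (0-4095)")
    if dei not in (0, 1):
        raise ValueError("dei must be 0 or 1")
    return (priority << 13) | (dei << 12) | vlan_id


if __name__ == "__main__":
    print(hex(vlan_tci(priority=5, vlan_id=100)))
```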

AlgorithmType

System-provided collective algorithms (Expanders)

Name Number Description
ALGO_UNSPECIFIED 0

Algorithm type not specified

ALGO_ALL_TO_ALL_PARALLEL 1

All To All collective with chunk transfer in parallel

ALGO_ALL_TO_ALL_PXN 2

All To All collective that leverages message aggregation and rail-optimized topologies.

ALGO_ALL_REDUCE_UNIDIRECTIONAL_RING 10

All Reduce collective with chunk transfer in a unidirectional ring

ALGO_ALL_REDUCE_BIDIRECTIONAL_RING 11

All Reduce collective with chunk transfer in a bidirectional ring

ALGO_ALL_REDUCE_VECTOR_HALVING_DOUBLING 12

All Reduce collective using a vector halving doubling algorithm

ALGO_ALL_REDUCE_DOUBLE_BINARY_TREE 13

All Reduce collective with chunk transfer in a double binary tree

ALGO_ALL_GATHER_RING 20

All Gather collective with chunk transfer in a unidirectional ring

ALGO_REDUCE_SCATTER_RING 30

Reduce Scatter collective with chunk transfer in a unidirectional ring

ALGO_BROADCAST_PARALLEL 40

Broadcast collective with chunk transfer in parallel

ALGO_GATHER_PARALLEL 50

Gather point to point collective (all to one) with chunk transfer in parallel

ALGO_ALL_TO_ALL_PARALLEL_PP 60

All To All collective with chunk transfer in parallel MSCCL++ implementation

ALGO_ALL_REDUCE_UNIDIRECTIONAL_RING_PP 70

All Reduce collective with chunk transfer in a unidirectional ring MSCCL++ implementation

BufferFullAction

Action to take when the packet capture buffer is full

Name Number Description
BUFFER_ACTION_OVERRIDE 0

Override the current buffer and save its contents

BUFFER_ACTION_STOP 1

Stop the capture and dump the data to a file

EcnBits

ECN field within an IP packet

Name Number Description
ECN_UNSPECIFIED 0

ECN bits not specified

ECN_DISABLED 1

00 - Non-ECT - Packets are marked as not ECN-capable

ECN_ECT1 2

01 - ECT(1) - ECN-capable transport

ECN_ECT0 3

10 - ECT(0) - ECN-capable transport
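
Note that the enum numbers (for example ECN_ECT1 = 2) are not the two-bit values that appear in the IP header; those are the binary codepoints given in the descriptions. As an illustration, the sketch below maps the enum names to their codepoints and combines one with a 6-bit DSCP to form the 8-bit IPv4 ToS / IPv6 Traffic Class field, following the standard layout of DSCP in the upper six bits and ECN in the lower two. `ECN_CODEPOINTS` and `traffic_class_byte` are hypothetical names for this example.

```python
# Two-bit ECN codepoints, taken from the descriptions above.
# These differ from the enum numbers (ECN_ECT1 = 2, ECN_ECT0 = 3).
ECN_CODEPOINTS = {
    "ECN_DISABLED": 0b00,
    "ECN_ECT1": 0b01,
    "ECN_ECT0": 0b10,
}


def traffic_class_byte(dscp, ecn_name):
    """Combine a 6-bit DSCP and an ECN codepoint into the 8-bit IP field."""
    if not 0 <= dscp <= 63:
        raise ValueError("dscp must fit in 6 bits (0-63)")
    return (dscp << 2) | ECN_CODEPOINTS[ecn_name]


if __name__ == "__main__":
    print(traffic_class_byte(26, "ECN_ECT0"))
```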

FlowControlDistributionType

Defines the way credits are distributed when flow control feature is used

Name Number Description
FLOW_CONTROL_DISTRIBUTION_TYPE_UNSPECIFIED 0

Credit distribution type not specified

ALL_STREAMS 1

Credits distributed to all available streams.

SINGLE_STREAM 2

Credits distributed to a single stream at a time.

PlatformType

Platform types available for executing workloads

Name Number Description
PLATFORM_UNSPECIFIED 0

Platform type not specified

PLATFORM_KEYS_NCCL_TEST 1

Use Keysight orchestrated NCCL test to execute the workload

PLATFORM_KEYS_SW_AGENT 2

Use Keysight software agent to execute the workload

PLATFORM_KEYS_HW 3

Use Keysight hardware to execute the workload

PLATFORM_EXTERNAL 10

EXPERIMENTAL External Platform

SCP_SIMULATION 50

Use SCP Simulation platform to execute the workload

PrioTrustMode

Configures trust state

Name Number Description
TRUST_UNSPECIFIED 0

Priority trust mode not specified

TRUST_DSCP 1

L3 trust, based on Differentiated Services Code Point (DSCP)

RDMAVerb

RDMA verbs supported for data transfers

Name Number Description
VERB_UNSPECIFIED 0

RDMA verb not specified

VERB_WRITE 1

RDMA Verb Write

VERB_SEND 2

RDMA Verb Send

RoCEv2QPNegotiationMethod

Methods available for RoCEv2 Queue Pair negotiation

Name Number Description
METHOD_UNSPECIFIED 0

Queue Pair negotiation method not specified

METHOD_KEYS_PROPRIETARY 1

Negotiate queue pairs using Keysight proprietary implementation

METHOD_TCP 2

Negotiate queue pairs using TCP

METHOD_TCP_STORE 3

Negotiate queue pairs using a TCP-based store implementation

METHOD_RDMA_CM 4

Negotiate queue pairs using RDMA communication management implementation

METHOD_AUTOMATIC 5

Use the default negotiation method for the selected platform

RoCEv2RNRTimeout

Receiver Not Ready (RNR) timeout values. Relevant only for RC QPs.

Name Number Description
TIMEOUT_655360_MU 0

Receiver Not Ready Timeout 655.36 milliseconds delay

TIMEOUT_10_MU 1

Receiver Not Ready Timeout 0.01 milliseconds delay

TIMEOUT_20_MU 2

Receiver Not Ready Timeout 0.02 milliseconds delay

TIMEOUT_30_MU 3

Receiver Not Ready Timeout 0.03 milliseconds delay

TIMEOUT_40_MU 4

Receiver Not Ready Timeout 0.04 milliseconds delay

TIMEOUT_60_MU 5

Receiver Not Ready Timeout 0.06 milliseconds delay

TIMEOUT_80_MU 6

Receiver Not Ready Timeout 0.08 milliseconds delay

TIMEOUT_120_MU 7

Receiver Not Ready Timeout 0.12 milliseconds delay

TIMEOUT_160_MU 8

Receiver Not Ready Timeout 0.16 milliseconds delay

TIMEOUT_240_MU 9

Receiver Not Ready Timeout 0.24 milliseconds delay

TIMEOUT_320_MU 10

Receiver Not Ready Timeout 0.32 milliseconds delay

TIMEOUT_480_MU 11

Receiver Not Ready Timeout 0.48 milliseconds delay

TIMEOUT_640_MU 12

Receiver Not Ready Timeout 0.64 milliseconds delay

TIMEOUT_960_MU 13

Receiver Not Ready Timeout 0.96 milliseconds delay

TIMEOUT_1280_MU 14

Receiver Not Ready Timeout 1.28 milliseconds delay

TIMEOUT_1920_MU 15

Receiver Not Ready Timeout 1.92 milliseconds delay

TIMEOUT_2560_MU 16

Receiver Not Ready Timeout 2.56 milliseconds delay

TIMEOUT_3840_MU 17

Receiver Not Ready Timeout 3.84 milliseconds delay

TIMEOUT_5120_MU 18

Receiver Not Ready Timeout 5.12 milliseconds delay

TIMEOUT_7680_MU 19

Receiver Not Ready Timeout 7.68 milliseconds delay

TIMEOUT_10240_MU 20

Receiver Not Ready Timeout 10.24 milliseconds delay

TIMEOUT_15360_MU 21

Receiver Not Ready Timeout 15.36 milliseconds delay

TIMEOUT_20480_MU 22

Receiver Not Ready Timeout 20.48 milliseconds delay

TIMEOUT_30720_MU 23

Receiver Not Ready Timeout 30.72 milliseconds delay

TIMEOUT_40960_MU 24

Receiver Not Ready Timeout 40.96 milliseconds delay

TIMEOUT_61440_MU 25

Receiver Not Ready Timeout 61.44 milliseconds delay

TIMEOUT_81920_MU 26

Receiver Not Ready Timeout 81.92 milliseconds delay

TIMEOUT_122880_MU 27

Receiver Not Ready Timeout 122.88 milliseconds delay

TIMEOUT_163840_MU 28

Receiver Not Ready Timeout 163.84 milliseconds delay

TIMEOUT_245760_MU 29

Receiver Not Ready Timeout 245.76 milliseconds delay

TIMEOUT_327680_MU 30

Receiver Not Ready Timeout 327.68 milliseconds delay

TIMEOUT_491520_MU 31

Receiver Not Ready Timeout 491.52 milliseconds delay
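
In every entry above, the number embedded in the enum name matches the described delay when read as microseconds (for example, TIMEOUT_655360_MU is 655.36 milliseconds). Assuming that naming convention holds, a client could derive the delay from the name instead of keeping a separate lookup table. `rnr_timeout_ms` is a hypothetical helper for this sketch.

```python
import re

# Matches names such as TIMEOUT_655360_MU and captures the numeric part.
_RNR_NAME = re.compile(r"^TIMEOUT_(\d+)_MU$")


def rnr_timeout_ms(enum_name):
    """Return the delay in milliseconds encoded in a RoCEv2RNRTimeout name.

    Assumption: the numeric part of the name is the delay in microseconds,
    which is consistent with every description in the table above.
    """
    match = _RNR_NAME.match(enum_name)
    if match is None:
        raise ValueError(f"not a RoCEv2RNRTimeout name: {enum_name!r}")
    return int(match.group(1)) / 1000


if __name__ == "__main__":
    print(rnr_timeout_ms("TIMEOUT_1280_MU"))
```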

SpeedMode

Speed, modulation and FEC mode

Name Number Description
UNSPECIFIED 0

Speed mode not specified

MODE_100GE_NRZ_RS_FEC 9

100GE port speed with NRZ modulation and RS FEC

MODE_100GE_NRZ_NO_FEC 10

100GE port speed with NRZ modulation and no FEC

MODE_100GE_PAM4_53G_KP4_FEC 8

100GE port speed with 53G PAM4 modulation and KP4 FEC

MODE_100GE_PAM4_106G_RS_FEC 7

100GE port speed with 106G PAM4 modulation and RS FEC

MODE_100GE_PAM4_106G_KP4_FEC 6

100GE port speed with 106G PAM4 modulation and KP4 FEC

MODE_200GE_PAM4_53G_KP4_FEC 5

200GE port speed with 53G PAM4 modulation and KP4 FEC

MODE_200GE_PAM4_106G_KP4_FEC 4

200GE port speed with 106G PAM4 modulation and KP4 FEC

MODE_400GE_PAM4_53G_KP4_FEC 3

400GE port speed with 53G PAM4 lanes and KP4 FEC

MODE_400GE_PAM4_106G_KP4_FEC 2

400GE port speed with 106G PAM4 lanes and KP4 FEC

MODE_800GE_PAM4_106G_KP4_FEC 1

800GE port speed with 106G PAM4 lanes and KP4 FEC

SpeedType

Network interface speed types

Name Number Description
SPEED_UNSPECIFIED 0

Speed type not specified

SPEED_100G 1

100 Gigabit Ethernet

SPEED_200G 2

200 Gigabit Ethernet

SPEED_400G 3

400 Gigabit Ethernet

SPEED_800G 4

800 Gigabit Ethernet

app_common.proto

Common data types and artifacts produced during trial execution

SummaryTable

Summary table (nccl-like), available after a trial has finished running

Uses schema since this table will be included in TrialReport for immediate access in addition to being available in storage as a feather file.

Field Type Label Description
summary_rows table.TableRow repeated

Rows of summary data, one per collective operation

summary_schema table.TableSchema

Schema defining columns in the summary table

TrialArtifacts

List of artifacts (absolute file paths) produced by a single trial run and saved to persistent storage.

Table schema for each data table.

Field Type Label Description
configuration_files string repeated

Paths to configuration files used in the trial

summary_data_files string repeated

Paths to summary data files

port_metrics_data_files string repeated

Paths to port-level metrics files

fabric_metrics_data_files string repeated

Paths to fabric-level metrics files

pcap_files string repeated

Paths to packet capture files

custom_files string repeated

Paths to custom user-generated files

error_msg_files string repeated

Paths to error message log files

flow_metrics_data_files string repeated

Paths to flow-level metrics files

data_chunk_data_files string repeated

Paths to data chunk metrics files

impairment_metrics_data_files string repeated

Paths to impairment metrics files

qp_metrics_data_files string repeated

Paths to queue pair metrics files

datasize_breakdown_metrics_data_files string repeated

Paths to data size breakdown metrics files

iteration_metrics_data_files string repeated

Paths to per-iteration metrics files

emulation_summary_metrics_data_files string repeated

Paths to emulation summary metrics files

workload_details_data_files string repeated

Paths to workload detail files

packet_drop_metrics_data_files string repeated

Paths to packet drop metrics files

packet_reorder_metrics_data_files string repeated

Paths to packet reorder metrics files

ArtifactType

Types of artifacts produced during trial execution.

Classifies output files by their content and purpose.

Name Number Description
ART_UNSPECIFIED 0

Artifact type not specified

ART_CONFIGURATION 1

Configuration files used for the trial

ART_SUMMARY_DATA 2

High-level summary data and statistics

ART_PORT_METRICS_DATA 5

Per-port performance metrics

ART_FABRIC_METRICS_DATA 6

Fabric-wide performance metrics

ART_PCAP 7

Packet capture files (pcap format)

ART_CUSTOM 8

User-defined custom artifact files

ART_ERROR_MSG 9

Error messages and diagnostics

ART_DATA_CHUNK_DATA 10

Data chunk transfer metrics

ART_FLOW_METRICS_DATA 11

Per-flow communication metrics

ART_IMPAIRMENT_METRICS_DATA 12

Network impairment application metrics

ART_QP_METRICS_DATA 13

Queue pair (QP) level metrics

ART_DATASIZE_BREAKDOWN_METRICS_DATA 14

Message size distribution breakdown

ART_ITERATION_METRICS_DATA 15

Per-iteration performance metrics

ART_EMULATION_SUMMARY_METRICS_DATA 16

Overall emulation summary statistics

ART_WORKLOAD_DETAILS_DATA 17

Detailed workload execution information

ART_PACKET_DROP_METRICS_DATA 18

Packet drop impairment metrics

ART_PACKET_REORDER_METRICS_DATA 19

Packet reorder impairment metrics

dse_infra.proto

Data models for DSE infrastructures to be used by applications

BlackboxFabric

Describes a fabric where the internal topology is not modeled.

Field Type Label Description
name string optional

The name identifier for this blackbox fabric instance.

ClosFabric

Describes a multi-stage Clos fabric.

Field Type Label Description
host_count uint32

The number of hosts in the fabric. This value is used to determine the number of racks and switches needed in the fabric based on the tier configurations.

host_nic_speed InfraBandwidth

The speed of the host NICs.

host_max_ports_up_to_peer MaxPortsUpToPeer

The maximum number of ports from each host up to its peer switch.

host_wiring_schema HostWiringSchema

The schema for the host wiring.

spine_wiring_schema SpineWiringSchema

The schema for the spine wiring.

tier_configs TierConfig repeated

The configurations for each tier in the fabric. The first element in the list corresponds to the leaf tier, the second to the tier above it, and so on.

Fabric

Describes the overall fabric topology.

Field Type Label Description
blackbox BlackboxFabric

Non-detailed generic fabric (a blackbox)

clos ClosFabric

Multi-stage Clos fabric

rackplane RackPlaneFabric

EXPERIMENTAL: network fabric featuring multiple planes within a rack

GenericHost

Contains the input parameters for the generic host builder

Field Type Label Description
name string optional

The name of the generic host.

npu_count uint32

The number of Neural Processing Units (NPUs) in the host. N.B. generic hosts have a 1:1 ratio of NICs to GPUs.

custom_bandwidth_gbps uint32

The custom bandwidth for the generic host in Gbps.

nvlink_version_bandwidth NVLinkVersionBandwidth

The bandwidth of NVLink by version.

Host

Host message is a high level specification of a low level device

The Trial object returned by CreateTrial() includes a representation of each individual device.

Field Type Label Description
count uint32

The number of hosts in the fabric.

zionex ZionexHost

Zionex host type.

generic GenericHost

Generic host type.

rackplane RackPlaneHost

EXPERIMENTAL

use_npu_interconnect bool

Whether to use NPU interconnect for the host.

InfraBandwidth

The bandwidth specification for infrastructure components.

Field Type Label Description
custom_gbps uint32

A custom bandwidth for the infrastructure in Gbps.

fabric_speed common.SpeedType

The speed of the infrastructure fabric.

Infrastructure

Infrastructure configuration comprising hosts and the network fabric

Field Type Label Description
host Host

The type of host used in the infrastructure.

fabric Fabric optional

Specifies the fabric topology used in the infrastructure.

MaxPortsDownFromDevice

Field Type Label Description
port_count uint32

The maximum number of ports under a device.

use_half bool

Whether to use half the maximum ports.

MaxPortsUpToPeer

Field Type Label Description
port_count uint32

The maximum number of ports from each host up to its peer switch.

use_max bool

Whether to use the maximum number of ports.

OversubscriptionRatio

Represents an oversubscription ratio of the form downlink_factor:uplink_factor.

Field Type Label Description
downlink_factor uint32

The oversubscription ratio for downlink traffic.

uplink_factor uint32

The oversubscription ratio for uplink traffic.
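As one way to read a downlink_factor:uplink_factor ratio, the following Python sketch splits a switch's ports into downward-facing and upward-facing groups. The function name `split_ports` is hypothetical, and the sketch assumes every port on the switch runs at the same speed, so the port ratio equals the bandwidth ratio. It is not taken from the DSE implementation.

```python
def split_ports(switch_radix: int, downlink_factor: int, uplink_factor: int):
    """Return (down_ports, up_ports) with down:up equal to the given ratio.

    Assumption: all ports have equal speed, so a 3:1 ratio means three
    downlink ports for every uplink port. Ports that do not fit the
    ratio evenly are left unused.
    """
    if downlink_factor <= 0 or uplink_factor <= 0:
        raise ValueError("ratio factors must be positive")
    parts = downlink_factor + uplink_factor
    unit = switch_radix // parts  # number of ports per ratio unit
    return downlink_factor * unit, uplink_factor * unit


# A 64-port switch at 3:1 keeps 48 ports facing down and 16 facing up.
down, up = split_ports(64, 3, 1)
```

A 1:1 ratio on the same switch would give 32 ports in each direction, which corresponds to a non-oversubscribed tier under these assumptions.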

RackPlaneFabric

EXPERIMENTAL: A network fabric featuring multiple planes within a rack

Using a RackPlaneFabric requires the use of a RackPlaneHost.

The number of rack-switches for intra-rack communication is driven by RackPlaneHost.scale_up_nic_count.

The number of scale-out switches for inter-rack communication is driven by rack_count, hosts_per_rack, and RackPlaneHost.scale_out_nic_count.

Field Type Label Description
rack_count uint32 optional

The number of racks in the fabric.

hosts_per_rack uint32 optional

The number of hosts per rack.

scale_out_switch_speed common.SpeedType

Ethernet speed for inter-rack communication via scale-out switches.

scale_up_switch_speed common.SpeedType

Ethernet speed for intra-rack communication on the scale-up network.

RackPlaneHost

Host type intended to be used in conjunction with RackPlaneFabric.

It connects to each of the RackPlaneFabric planes with a dedicated scale up NIC.

Field Type Label Description
npu_count uint32 optional

The number of NPUs in the host.

scale_up_nic_count uint32 optional

The number of NICs for scale-up traffic.

scale_out_nic_count uint32 optional

The number of NICs for scale-out traffic.

name string optional

The name identifier for this RackPlane host instance.

TierConfig

Field Type Label Description
switch_radix uint32

The radix of the switch in the fabric.

oversubscription_ratio OversubscriptionRatio

The oversubscription ratio for this tier, expressed as downlink_factor:uplink_factor. This determines the ratio of downward-facing to upward-facing port bandwidth.

port_speed InfraBandwidth

The port speed (bandwidth) for all switch ports on this tier. Can be specified as a custom value in Gbps or as a standard fabric speed.

max_ports_up_to_peer MaxPortsUpToPeer

The maximum number of ports that can connect from a switch on this tier to a single switch on the next higher tier. Controls link aggregation between tiers.

max_ports_down_from_device MaxPortsDownFromDevice

The maximum number of ports on each switch that can connect downward to the next lower tier. Typically uses half the switch radix by default.

ZionexHost

zionex host builder takes no inputs

Field Type Label Description
name string optional

The name identifier for this Zionex host instance.

HostWiringSchema

Name Number Description
HOST_WIRING_SCHEMA_UNSPECIFIED 0

Undeclared wiring schema.

FAIR_DISTRIBUTION 1

Fair distribution wiring schema. Connect each host to a different rack switch in round-robin fashion until all hosts have been connected.

LEFT_TO_RIGHT 2

Left to right wiring schema. Connect hosts to rack switches from left to right. Move to next switch only after a switch's capacity has been filled.

RAIL_OPTIMIZED 3

Rail optimized wiring schema. The nth NICs of all hosts are connected to the same switches.
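The three wiring schemas can be contrasted with a small Python sketch. The function names, the 0-based indices, and the parameters (`switch_count`, `switch_capacity`, `nics_per_host`) are assumptions made for illustration. The sketch only mirrors the descriptions above and is not the DSE wiring code.

```python
def fair_distribution(host_count: int, switch_count: int) -> list:
    """FAIR_DISTRIBUTION: hosts are assigned to rack switches round-robin."""
    return [host % switch_count for host in range(host_count)]


def left_to_right(host_count: int, switch_capacity: int) -> list:
    """LEFT_TO_RIGHT: fill one switch to capacity before moving to the next."""
    return [host // switch_capacity for host in range(host_count)]


def rail_optimized(host_count: int, nics_per_host: int) -> dict:
    """RAIL_OPTIMIZED: the nth NIC of every host lands on switch n."""
    return {
        (host, nic): nic
        for host in range(host_count)
        for nic in range(nics_per_host)
    }


# Each list entry is the switch index chosen for host 0, 1, 2, ...
round_robin = fair_distribution(host_count=5, switch_count=2)
packed = left_to_right(host_count=5, switch_capacity=2)
rails = rail_optimized(host_count=2, nics_per_host=2)
```

With five hosts, the round-robin variant alternates between the two switches, while the left-to-right variant needs a third switch once the first two are full.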

NVLinkVersionBandwidth

The bandwidth of NVLink by version.

Name Number Description
NVLINK_VERSION_BANDWIDTH_UNSPECIFIED 0

The bandwidth for the NVLink version is not specified.

SpineWiringSchema

Name Number Description
SPINE_WIRING_SCHEMA_UNSPECIFIED 0

The schema for the spine wiring is not specified.

FULLY_CONNECTED 1

The spines are fully connected. Every spine switch is connected to every switch in the previous tier.

SPINE_SETS 2

The spines are partially connected in sets as determined by the number of hosts, tiers, and switch radix constraints. The spine layer consists of one or more spine sets such that every switch within a spine set is connected to the nth switch of every connectivity group in the previous tier. An example of this wiring scheme can be seen in this article: https://packetpushers.net/blog/demystifying-dcn-topologies-clos-fat-trees-part2/
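The difference between the two spine schemas can be sketched in Python as link lists. The function names and the switch naming are hypothetical, and the spine-set variant assumes the number of spine sets equals the number of switches in each connectivity group. How DSE actually sizes the sets from hosts, tiers, and radix is not modeled here.

```python
from itertools import product


def fully_connected(spines, lower_switches):
    """FULLY_CONNECTED: every spine links to every switch in the previous tier."""
    return list(product(spines, lower_switches))


def spine_set_links(spine_sets, groups):
    """SPINE_SETS: spines in set n link to the nth switch of every group.

    spine_sets: list of lists of spine switch names.
    groups: list of connectivity groups, each a list of lower-tier switches.
    Assumption: len(spine_sets) equals the size of every group.
    """
    links = []
    for n, spine_set in enumerate(spine_sets):
        for spine in spine_set:
            for group in groups:
                links.append((spine, group[n]))
    return links


full = fully_connected(["s0", "s1"], ["l0", "l1"])
partial = spine_set_links([["s0"], ["s1"]], [["a0", "a1"], ["b0", "b1"]])
```

In the partial case, spine `s0` reaches only the first switch of each group and `s1` only the second, so each spine needs fewer ports than in the fully connected case.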

kccb.proto

Top

kccb.proto

Data models and APIs for describing a Keysight Collective Communication Benchmark

Benchmark

Benchmark message is a high level abstraction of a Chakra workload.

The message MUST be converted into a list of Chakra et_def.proto Node messages, and the dse.Experiment.workload field should be populated with the converted list.

Field Type Label Description
collective_algorithm common.Algorithm

communication collective algorithm

datasize Datasize

A data size iterator with start, step, and end values

datasize_list DatasizeList

A list of data size definitions

iterations int32

Number of benchmark runs per data size

channels channels.ChannelsTopology

The channels topology configuration specifying how ranks are distributed and connected.

iteration_append_delay int32 optional

The delay (in ms) between repeated executions of the benchmark.

Config

High-level KCCB spec for a trial.

Field Type Label Description
benchmark Benchmark

The benchmark configuration defining the collective communication test.

Datasize

Datasize message is a container for specifying the data sizes for each benchmark collective run.

Field Type Label Description
start uint64

The initial data size of a benchmark collective operation.

step uint32

The amount to increment the data size between iterations

end uint64

maximum data size after which the benchmark completes
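To make the start, step, and end semantics concrete, here is a minimal Python sketch that expands the three values into the list of sizes a run might cover. The function name `expand_datasizes` is hypothetical, and the sketch assumes an additive step and an inclusive `end`. The server-side expansion may differ.

```python
def expand_datasizes(start: int, step: int, end: int) -> list:
    """Expand a start/step/end iterator into explicit data sizes.

    Assumptions: the step is added to the previous size, and `end`
    itself is included when the sequence lands on it exactly.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if end < start:
        raise ValueError("end must not be smaller than start")
    return list(range(start, end + 1, step))


# 1 KiB to 4 KiB in 1 KiB increments.
sizes = expand_datasizes(start=1024, step=1024, end=4096)
```

Under the same assumptions, the equivalent DatasizeList would carry these values directly in its size_bytes field.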

DatasizeList

DatasizeList allows for specifying a custom list of data sizes

Field Type Label Description
size_bytes uint64 repeated

the size of data in bytes

bind.proto

Top

bind.proto

Protocol Buffers definitions for binding logical infrastructure elements to ranks, NICs, and physical resources.

Binding

Binds various settings to logical infrastructure elements.

The type of bound information depends on the type of binding.

Field Type Label Description
custom_binding CustomBinding

Custom binding configuration that maps ranks to NPUs, configures NIC settings, and assigns physical test resources.

infrastructure_profile profiles.InfraProfile

the Binding keeps a copy of the infrastructure profile so that Trials can be re-run with the original versions of (potentially modified) profiles.InfraProfile

infrastructure keysight_chakra.infra.Infrastructure

low-level chakra infrastructure matching infrastructure_profile

infra_annotations keysight_chakra.infra.Annotation repeated

Annotations for low-level chakra infrastructure.

platform_regions PlatformRegion repeated

Assigns platforms to different regions of the infrastructure. Should be populated by the server and be a copy of the platform_regions sent in CreateBindingRequest.

CustomBinding

Binds:

a) Ranks to Logical NPUs

b) NIC settings to Logical NICs

c) assigned test resources

Field Type Label Description
rank_bindings RankBinding repeated

List of bindings that map each rank to a specific logical NPU and its available NICs in the infrastructure.

nic_bindings NicBinding repeated

List of NIC configuration bindings that specify settings and associated physical bindings for each logical NIC.

physical_bindings PhysicalBinding repeated

List of bindings that associate logical infrastructure elements with physical test resources such as chassis or servers.

InfraRef

Reference to a logical infrastructure element

Field Type Label Description
device_instance_name string

name of the logical infrastructure device (e.g. the value dse_infra.GenericHost.name was set to)

device_index int32

0-based device index

component_name string

name of the logical infrastructure component

component_index int32

0-based component index

InfraRegion

Defines a region within the infrastructure by specifying boundary components.

Used to group infrastructure elements for platform assignment or other purposes.

Field Type Label Description
boundary_refs InfraRef repeated

These components mark the boundary of a region within the infrastructure.

NicBinding

Logical NIC Settings

Field Type Label Description
infra_ref InfraRef

ref to a NIC defined in the chakra infrastructure

nic_settings common.NicSettings

Auto-populated; can be overridden by the user.

associated_physical_bindings InfraRef repeated

flows from this nic may be generated by test ports from any of these associated physical bindings

PhysicalBinding

Binds a logical infrastructure element to a physical test resource, specifying the platform type, physical location, and layer 1 settings.

Field Type Label Description
infra_ref InfraRef

reference to the logical infrastructure element that the physical resource represents

platform common.PlatformType

The platform type for this physical binding (e.g., hardware chassis or software server).

chassis_location common.ChassisInfo

Chassis location of the physical resource; used with Keysight hardware platforms.

server_location common.ServerInfo

Server location of the physical resource; used with NCCL and Keysight software platforms.

layer1 common.Layer1

Layer 1 physical settings such as link speed, duplex mode, and physical layer parameters.

capture common.PacketCapture optional

Packet capture settings.

PlatformRegion

Assigns a platform to a region in the infrastructure.

Currently for internal use only.

Field Type Label Description
region InfraRegion

The infrastructure region to which a platform will be assigned. Defines the boundary components that comprise the region.

platform common.PlatformType optional

The platform type to assign to this infrastructure region for testing.

RankBinding

Assign Ranks to Logical NPUs

Field Type Label Description
infra_ref InfraRef

ref to NPU used by this rank

rank_id int32

The unique identifier for this rank in the distributed workload. Specifies which rank is bound to the referenced NPU.

nic_refs InfraRef repeated

list of NICs available to NPU (in the same host)
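The relationship between RankBinding and InfraRef can be sketched with plain Python dataclasses that mirror the documented field names. These classes are stand-ins rather than the generated protobuf classes, and the helper `bind_ranks`, the component names "npu" and "nic", and the one-NIC-per-NPU layout are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InfraRef:
    device_instance_name: str
    device_index: int      # 0-based device index
    component_name: str
    component_index: int   # 0-based component index


@dataclass
class RankBinding:
    infra_ref: InfraRef                 # NPU used by this rank
    rank_id: int
    nic_refs: List[InfraRef] = field(default_factory=list)


def bind_ranks(host_name: str, host_count: int, npus_per_host: int):
    """Assign consecutive rank ids to NPUs, host by host.

    Assumption: each NPU has exactly one NIC with the same index
    in the same host.
    """
    bindings = []
    for host in range(host_count):
        for npu in range(npus_per_host):
            bindings.append(
                RankBinding(
                    infra_ref=InfraRef(host_name, host, "npu", npu),
                    rank_id=host * npus_per_host + npu,
                    nic_refs=[InfraRef(host_name, host, "nic", npu)],
                )
            )
    return bindings


bindings = bind_ranks("host", host_count=2, npus_per_host=4)
```

Here rank 5 resolves to NPU 1 on the second host (device_index 1), and its only available NIC is NIC 1 in that same host.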

Scalar Value Types

.proto Type | Notes | C++ | Java | Python | Go | C# | PHP | Ruby
double | | double | double | float | float64 | double | float | Float
float | | float | float | float | float32 | float | float | Float
int32 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required)
int64 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long | int64 | long | integer/string | Bignum
uint32 | Uses variable-length encoding. | uint32 | int | int/long | uint32 | uint | integer | Bignum or Fixnum (as required)
uint64 | Uses variable-length encoding. | uint64 | long | int/long | uint64 | ulong | integer/string | Bignum or Fixnum (as required)
sint32 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required)
sint64 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. | int64 | long | int/long | int64 | long | integer/string | Bignum
fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int | uint32 | uint | integer | Bignum or Fixnum (as required)
fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long | uint64 | ulong | integer/string | Bignum
sfixed32 | Always four bytes. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required)
sfixed64 | Always eight bytes. | int64 | long | int/long | int64 | long | integer/string | Bignum
bool | | bool | boolean | boolean | bool | bool | boolean | TrueClass/FalseClass
string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode | string | string | string | String (UTF-8)
bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str | []byte | ByteString | string | String (ASCII-8BIT)