lssndrbarbieri@gmail.com Alessandro Barbieri The Datatype Comparison (DTCMP) Library provides pre-defined and user-defined comparison operations to compare the values of two items which can be arbitrary MPI datatypes. Using these comparison operations, the library provides various routines for manipulating data, which may be distributed over the processes of an MPI communicator including: search - search for a target value in an ordered list of values merge - combine multiple ordered lists into a single ordered list partition - divide a list of items into lower and higher values around a specified pivot value select - identify the kth largest value sort - sort data items into an ordered list rank - assign group ids and ranks to a list of items scan - execute a segmented scan on an ordered list of values The DTCMP library is designed to provide a high-level interface to the above functionality. These high-level routines will invoke various algorithm implementations to achieve the desired output. The goal of DTCMP is to be efficient given the input and the data distribution among processes. It is also intended to be portable to different platforms and to allow for easy addition of new algorithms over time. While performance is important, the goal is not to provide the fastest routines. The generality provided by the DTCMP API that makes portability possible also tends to reduce performance in some respects, e.g., forcing memory copies, abstracting some details about datatype, etc. Most likely a hand-tuned algorithm for the precise problem at hand will always be faster than DTCMP. However, DTCMP should be fast, efficient, and portable, so it will generally be a good option except for those cases where the application bottleneck demands absolute performance. Currently, the following pre-defined comparison operations are provided. More will be added with time. All pre-defined operations have the following naming convention: DTCMP_OP_TYPE_DIRECTION where TYPE may be one of: SHORT - C short INT - C int LONG - C long LONGLONG - C long long UNSIGNEDSHORT - C unsigned short UNSIGNEDINT - C unsigned int UNSIGNEDLONG - C unsigned long UNSIGNEDLONGLONG - C unsigned long long INT8T - C int8_t INT16T - C int16_t INT32T - C int32_t INT64T - C int64_t UINT8T - C uint8_t UINT16T - C uint16_t UINT32T - C uint32_t UINT64T - C uint64_t FLOAT - C float DOUBLE - C double LONGDOUBLE - C long double and DIRECTION may be one of: ASCEND - order values from smallest to largest DESCEND - order values from largest to smallest Often when sorting data, each item contains a "key" that determines its position within the global order and a "value", called "satellite data", which travels with the key value but has no affect on its order. DTCMP assumes that satellite data is relatively small and is attached to the key in the same input buffer. In many DTCMP routines, one must specify the datatype for the key and another datatype for the key with its satellite data. The first is often named "key" and the second "keysat". The key datatype is used to infer the type and size of the key when comparing key values. This can be exploited to select optimized comparison routines, e.g., radix sort on integers, and it enables the library to siphon off and only process the key component if needed. The keysat type is needed to copy full items in memory or transfer items between processes. https://github.com/LLNL/dtcmp/issues LLNL/dtcmp