Deep rejoining model and dataset of oracle bone fragment images

Introduction
As a precious cultural heritage of the Chinese nation, oracle bone inscriptions record crucial historical information from ancient China dating back over 3000 years, and they are of great significance for the protection and inheritance of traditional Chinese culture. However, when the oracle bones were excavated or transported, they were fragmented into numerous small pieces; approximately 160,000 oracle bones are now scattered across institutions in more than twenty countries and regions, which makes their physical reunification infeasible. As a result, rejoining the fragments from their images has become an important research direction1. To rejoin oracle bone fragments, restoration experts must have knowledge of ancient Chinese calligraphy and oracle bone morphology, and must memorize a large amount of information about the fragments2. Images of rubbings of oracle bone fragments have traditionally been used for rejoining, but making a rubbing itself damages the oracle bone. Moreover, in the past 120 years only 6400 oracle bone fragment records have been manually reassembled, each record being a combination of fragments of one object3, as shown in Fig. 1; manual rejoining of so many fragments is highly inefficient4,5.

ZLC_797 is the source image and ZLC_799 is the target image. In the first stage, the matching edge segments of the two images are found by the LSES algorithm, and the target area image is cut from the red box in the complete image rejoined by the CIR algorithm. In the second stage, the feature of the target area image is input to a machine learning or deep learning model to classify it by its texture continuity. The classification has two labels: rejoinable and unrejoinable.
Related work
Existing methods
Image restoration covers four situations: images that have templates, such as money-note fragments6,7; images that overlap, as in image stitching; regular-shape images, such as jigsaw pieces; and irregular images of object fragments, such as oracle bone fragment images8,9,10,11,12,13,14,15,16,17. Rejoining object fragment images is a noteworthy research topic, and new methods in the restoration field have been introduced frequently in recent years. As oracle bone fragment images have been collected from various institutions, researchers1,2,3 have turned their attention to rejoining the fragments based on their images since the 1970s; such non-contact restoration avoids causing secondary damage to the oracle bone fragments.
Among local edge matching methods, a matching boundary segment is computed between each pair of fragments by solving the longest common subsequence (LCS) problem, and a multi-piece alignment is used to prune incorrect matches to compose the final image10. The polygon feature matching method searches adjacent fragment pairs using an improved local matching method and measures the degree of matching of each pair; a new path generation method based on the matching angle then generates a global path and reassembles the fragments along that path, finally realizing the restoration of multiple fragments14. The longitudinal shredded paper method achieves splicing and restoration of shredded paper by combining the degree matrix and the similarity matrix of the matching information, which are calculated according to the continuity between texts15. The 2D fragment assembly method11 uses the earth mover's distance, based on length/property correspondence, to measure similarity, potentially matching a point on the first contour to a desirable destination point on the second contour; a greedy algorithm then performs 2D fragment assembly by merging two neighboring fragments into a composite one. These methods require high edge consistency of the object fragments, which suits paper document fragments but not oracle bone fragment images with broken edges.
Among deep learning methods for rejoining object fragment images18,19,20,21,22, the shortest path optimization method18,19 uses a neural network to predict the position of an archeological fragment, and a graph leading to the best restoration is built from these predictions. Shredded image restoration22 uses a CNN to detect the compatibility of pairwise stitching and prune computed pairwise matches, applying boost training and a loop closure algorithm to improve the network's efficiency and accuracy. A Siamese network combines a residual network with a spatial pyramid pool to match excavated wooden fragment images17. A fragment-matching approach based on pairwise local assemblies uses a 2D Siamese neural network12 to evaluate the probability of matching of Ostraca fragments; the network is designed to predict the existence or absence of a set of matches and, simultaneously, the spatial relationship of one fragment with the other. A solution based on a Graph Neural Network13 uses pairwise patch information to assign labels to edges representing spatial relationships; this network classifies the relationship between a source and a target patch as one of Up, Down, Left, Right, or None, and by doing so for all edges, the model outputs a new graph representing a reconstruction proposal. However, most models provide a reorganization scheme for cultural relic fragments without considering the texture similarity of the fragments.
Dataset situation
The restoration of an oracle bone fragment starts by locating a matching edge segment of its object fragment images4; the oracle bone fragment images can then be rejoined into a whole, and the whole image can be used to calculate a texture continuity score of the two images, which describes whether the two oracle bone fragment images can be rejoined or not5. As shown in Fig. 1, it is essential that the ground truth of the matching edge segment is given before evaluating an algorithm's performance. However, in the field of cultural relic fragment restoration, most researchers have not reported their datasets6,7,8,9,10,11,12,13,14,15,16; only a few object fragment image datasets have been published17,18,19,20,21,22,23,24, and there is no generic fragment image dataset for evaluating rejoining methods, as shown in Table 1, where seven attributes of related work are compared.
Several datasets have been used but not reported in the restoration field. Shredded money note images can be located quite accurately using matching SIFT features6,7, because this type of work has template images. Da Gama Leitao and Jorge8 manufactured artificial but realistic images by shattering five rectangular unglazed ceramic tiles into 112 main pieces, and Wu9 tore an image into 12 pieces. Digital images of objects10,11 have been printed on paper, each image torn into multiple pieces, and each piece of one object scanned into an image set. These pieces are irregular, and the color and edge features of their images are extracted to rejoin them, but most such datasets include piece images of a single object; the real situation is that multiple objects of the same category have broken into many fragments, so the adaptability of these datasets is limited. In archeology, Cecilia's team extracted 6000 patches from Ostraca images12,13, and 30 Ostraca images were cut into 1000 patches without overlaps to evaluate their algorithms; Graph Neural Networks were used to reconstruct ancient documents. Document reconstruction14,15,16 used synthetic shred images and real images to estimate the effectiveness of the rejoining method. These pieces are regular artificial fragments or paper fragments.
There are several benchmark image datasets. Ngo prepared 37,760 samples by breaking 268 images of complete wooden tablets17, which were excavated in the Heijo-Kyo Palace ruins of the Nara period. Paumard released the MET (Metropolitan Museum of Art) jigsaw-puzzle dataset to evaluate their algorithm18,19; each image in the dataset has been resized, square-cropped, and divided into nine parts, and although the dataset includes 14,000 open-source images from the Metropolitan Museum of Art, it is not suitable for testing irregular fragment reassembly methods. Le conducted experiments on the MIT dataset20 and the BGU dataset21; most images in the two datasets have relatively low resolution, and both contain only a single, simple object. He therefore created a new benchmark dataset22 from a copyright-free image website, downloading different categories of images such as streets, mountains, and plants; 125 random images were selected and partitioned into training (100) and testing (25) data, so Le has no realistic fragment images, and his dataset contains a limited number of shredded scene images. Oracle bone fragment image sets23,24 have been built by photographing rubbings, which carry no texture.
Current challenges
As shown in the second row of Table 1, scene stitching and stereo matching images have overlapping patches, and a money fragment image can be located by matching a template image7, which requires no dataset; however, in rejoining object fragment images there are no overlapping patches and no general template image available for reference. Currently, irregular image datasets are very small and most have not been reported; datasets such as BGU, MET, MIT, and NNRICP have regular object pieces and a large scale, but they are not suitable for evaluating irregular fragment rejoining algorithms, and the oracle bone rubbing image dataset has no texture information. As shown in Table 1, these datasets fall short of the seven attributes of related work on rejoining fragment images. To address these challenges, we present a benchmark dataset for the field of rejoining fragment images25,26,27,28.
Existing restoration methods have been developed for years to rejoin cultural relic fragments, but several challenges remain in rejoining oracle bone fragments: (1) The different environments under which images are collected in major museums result in varying color spaces of oracle bone fragment images, making it difficult to calculate the continuity of oracle bone fragment images. (2) Breakage of the edges of oracle bone fragments makes it impossible to match their image edge segments exactly. (3) There is no benchmark dataset of oracle bone fragment color images to evaluate rejoining methods, so oracle bone fragment rejoining technology has developed slowly. To address these challenges, this paper gives a two-stage method to rejoin oracle bone fragment images and a benchmark dataset named the oracle bone fragment image (OBFI) dataset to evaluate related methods.
Methodology
Assuming that the broken fragments of each oracle bone have been perfectly preserved and are complete, the images of each two neighboring fragments not only have matching edge segments but also have continuous texture features. This section describes the training and test datasets and the research methodology used to rejoin images of oracle bone fragments.
Dataset acquisition
Oracle bone fragments are cultural relics too precious to be examined in person at will, but they have been photographed and published on the “yinqiwenyuan” website29, from which we obtained the oracle bone images. After processing, the oracle bone fragment images are in the same scale space as the original objects, each fragment's real current condition is properly preserved, the edge noise of all images has been removed, and the original texture is protected. The dataset consists only of turtle shell fragment images, contains rich texture information, and includes more than a hundred pairwise images that can be rejoined together. The resulting oracle bone fragment image dataset is named the OBFI dataset. The images are obtained by two methods.
Definition 1
(Image pair and relative complete image). If two fragments' images can be rejoined together correctly, the two images are called an image pair; if the two images are rejoined into a whole image, the whole image is still not the complete object, and it is called a relative complete image.
Method 1. Photoshop software. The ground truth rejoined images in this part are obtained by rejoining pairwise images using Photoshop software. The pairwise images are single oracle bone fragment images, and an archeologist affirmed that each two images are joinable28; the two images are rejoined into one relative complete image along their matching edge segment, the relative complete images are taken as ground truth, and all the images are named after their subimages. The ground truth is used to validate the effectiveness of fragment image rejoining methods.
Method 2. Image processing. There are not many pairs of color images of oracle bone fragments that can be rejoined, so two methods have been adopted to collect Target Area Images (TAI), as defined in paper4. One is performing an edge matching method to obtain TAIs; the other is cutting them from oracle bone images and rejoined images. Rejoinable target area images are obtained by cutting from color images of oracle bone fragments and from color images rejoined by the archeologist, where the color images are from the “yinqiwenyuan” website, while unrejoinable target area images are obtained using the EEDR algorithm4,5. The target area image dataset is used to learn texture continuity from rejoinable fragment images.
Data details
The dataset contains three parts. Part 1 contains four datasets of single oracle bone fragment images, Part 2 contains four datasets of ground truth rejoinable images, and the images in Part 1 and Part 2 are provided to test object fragment image reassembling algorithms and evaluate their performance. The dataset in Part 3 is used to train machine learning and deep learning models. The three parts are elaborated below.
Part 1: Four subdatasets consist of single images of oracle bone fragments. We provide four representative parts containing oracle bone fragment images, collected from the “yinqiwenyuan” website, which come from the Institute of History, Chinese Academy of Social Sciences (ZLC)29, Peking University (BZ)30, the Lushun Museum (LS)31, and the Institute of Oriental Culture, University of Tokyo (DJ)32. These subdatasets are respectively named ZLC, BZ, LS, and DJ. The ZLC dataset contains 1161 images as shown in Fig. 2, the BZ dataset contains 2029 images as in Fig. 3, the LS dataset contains 1495 images as in Fig. 4, and the DJ dataset contains 689 images as in Fig. 5. The four subdatasets, 5374 images in total, represent four different situations: yellowing (ZLC), high contrast (BZ), low contrast (LS), and grayscale (DJ). The number of matching image pairs and an overview of their situation are given in Table 2:

Image pair in ZLC single oracle bone fragment sub-dataset, (a) and (b) are both from ZLC sub-dataset, (c) is rejoined of ZLC_255 and ZLC_257.

Three images can be rejoined together in BZ single oracle bone fragment image sub-dataset, (a), (b) and (c) are all from BZ sub-dataset, (d) is rejoined of BZ_1159, BZ_2033, and BZ_2318.

Image pair in LS single oracle bone fragment sub-dataset, (a) and (b) are both from LS sub-dataset, (c) is rejoined of LS_443 and LS_1316.

Image pair in DJ single oracle bone fragment sub-dataset, (a) and (b) are both from DJ sub-dataset, (c) is rejoined of DJ_0134 and DJ_0198.
1. Rotated and translated images having the same scale space. Each oracle bone fragment image in the dataset has the same scale space at a high resolution of 600 dots per inch, but the images show rotated or translated oracle bone fragments.
2. In different color space. Social organizations have preserved oracle bone fragments under different temperatures, humidity, and illumination, which cause different color space of the images, such as yellowing, high contrast, low contrast, and gray scale.
3. 110 oracle bone fragment image pairs. The image pairs are collected under different conditions with ground truth, because the two images of each pair can be rejoined into a relatively complete image.
4. Each oracle bone fragment has a unique identifier and an image. Each image has a texture in a different color space, it is named by its own subdataset and its collection number, for example, if an image was in the ZLC subdataset and its collection number is 797, it would be named ZLC_797.
Part 2: Four subdatasets contain rejoinable image pairs used as ground truth. As shown in Table 2, the ZLC ground truth dataset includes 23 relative complete images; each relative complete image is rejoined from two images in the ZLC dataset, as in Fig. 2c. The BZ ground truth dataset includes 41 relative complete images rejoined from image pairs in the BZ dataset, as in Fig. 3d. The LS ground truth dataset includes 36 relative complete images rejoined from image pairs in the LS dataset, as in Fig. 4c. The DJ ground truth dataset includes 10 relative complete images rejoined from image pairs in the DJ dataset, as in Fig. 5c. In these subdatasets there are 110 relative complete images in total, and the ground truth rejoinable positions are given to evaluate rejoining methods. For instance, the relative complete image named ZLC_797+ZLC_799.jpg is one of the 23 images stored in a ZLC folder under the “relative complete images” directory, and their single images are saved in another ZLC folder under the “single images” directory; the other three subdatasets are organized in the same way.
Part 3: Two subdatasets consist of target area images. There are not many pairs of color images of oracle bone fragments that have been rejoined, so two methods have been adopted to collect target area images, as described in Method 2 of Section 3.1. We have built an image dataset that includes 138,855 target area images, as in Fig. 6; there are 115,893 unrejoinable target area images and 22,962 rejoinable target area images, as shown in Table 3. This dataset is used to learn the texture continuity of the target area image and to test the performance of machine learning and deep learning models in the experiments. We also provide 110 target area images cut from relative complete images rejoined from ground truth images and 110 unrejoinable images from practical rejoining work; this set is used to test the performance of the texture continuity evaluation model in rejoining work. With these data, the two stages of rejoining oracle bone fragment images can work together: edge similarity and texture continuity are used jointly to determine whether two images are rejoinable.

Images in (a)–(d) and (i)–(l) are rejoinable target area images, and images in (e)–(h) and (m)–(p) are unrejoinable target area images from Part 3 of the dataset, which is used to train the models to learn whether the textures of two images are rejoinable or not.
Methods
We propose a two-stage method to rejoin images of oracle bone fragments. In the first stage, the matching edge segments of two images are found by the LSES algorithm, and the target area image is cut from the red box in the complete image rejoined by the CIR algorithm, as in Fig. 1. In this process, a distance adaptive threshold is used to discard some images, and the remaining images that meet the condition are kept as candidates. Each candidate image and its matching image are rejoined into a complete image, from which the target area image is cut to extract texture features. In the second stage, the target area image or its feature is input to a machine learning or deep learning model to be classified by its texture continuity; the model is pre-trained using the Part 3 subdataset of the OBFI dataset.
First stage
1. Longest similar edge segment. Based on the Edge Equal Pixel Rejoining algorithm and the Edge Equal Distance Rejoining (EEDR) algorithm4,5, we propose the longest similar edge segment (LSES) matching algorithm. As shown in Fig. 7a and b, the source image contour segment is first rotated multiple times to produce many coordinate sequences as candidate edge segments. In each candidate edge segment of the source image, one pixel point is sampled at intervals of a fixed number of pixels. Suppose the set of points sampled in the edge segment of the source image is Ps = {Sm-k, Sm-k+1, …, Sm-1, Sm, Sm+1, …, Sm+k}, and the distance set do = {dm-k, dm-k+1, …, dm-1, 0, dm+1, …, dm+k} is calculated from each sampling point to the middle pixel Sm of the source edge segment. The number of sampling points in the target image edge segment, and the set of distances dt = {dn-k, dn-k+1, …, dn-1, 0, dn+1, …, dn+k} from each sampling point to the middle pixel Tn of the target edge segment, must be the same as for the source image. Suppose the set of sampling points Pt = {Tn-k, Tn-k+1, …, Tn-1, Tn, Tn+1, …, Tn+k} is the coordinate sequence of the target edge segment; a translation vector is calculated from the middle points Sm and Tn of the two edge segments, and the sampling points of the source image are translated into the target image coordinate system. Among the candidate edge segments of the source image, we search for the similar edge segment with the minimum distance dsum over the sampling points of multiple edge segments of the target image contour, as in equation (1). If the distance meets the matching adaptive threshold and the number of sampling points cannot be increased further, the edge segment corresponding to the sampling points is regarded as the longest similar edge segment. This algorithm saves two-thirds of the time compared to the EEDR algorithm4,5. In Fig. 7e, the image cropped from the blue box is named the Target Area Image (TAI); it is used to calculate the texture similarity of image pairs. In Fig. 7c, matching sampling points have been interconnected; in Fig. 7d, the source image has been rotated to a position consistent with the target image, and the rotated sampling points have been interconnected; in Fig. 7f, g, and h, the parameter k and the interval have been increased to find the longest matching edge segment of the two images.
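The core distance computation of LSES, sampling points around a middle pixel and summing point-to-point distances after aligning the two middle points, can be sketched as follows. This is a minimal illustration of the step described above, not the authors' implementation; the helper names `sample_points` and `segment_distance` are ours.

```python
import math

def sample_points(contour, mid, k, interval):
    """Sample 2k+1 points centred on index `mid`, one every `interval` pixels
    along the contour (wrapping around a closed contour)."""
    return [contour[(mid + i * interval) % len(contour)] for i in range(-k, k + 1)]

def segment_distance(src_pts, tgt_pts):
    """Translate the source samples so their middle point Sm coincides with the
    target's middle point Tn, then return the summed distances d_sum."""
    k = len(src_pts) // 2
    sx, sy = src_pts[k]
    tx, ty = tgt_pts[k]
    dx, dy = tx - sx, ty - sy  # translation vector from the two middle pixels
    return sum(math.hypot(px + dx - qx, py + dy - qy)
               for (px, py), (qx, qy) in zip(src_pts, tgt_pts))
```

For two edge segments that are pure translations of each other, `segment_distance` is zero; in practice the candidate with the minimum `d_sum` that stays under the adaptive threshold is kept, and k is increased until the match no longer extends.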

The longest similar edge segment is located by increasing the parameter k and the interval, and the target area image is given to calculate the texture continuity score.
2. Complete image rejoining algorithm. After locating the longest matching edge segments of the source and target images, the two images must be reassembled into a whole. A complete image rejoining (CIR) method based on an image mask is given to merge the two images by matching the pixel coordinate sequences of the two image edge segments. First, the rejoined image size is defined as (Hr, Wr) and is calculated from the eight numbers in equations (2) and (3): Xs, Ys, Hs, Ws, Xd, Yd, Hd, and Wd. Among them, (Xs, Ys) are the coordinates of the midpoint pixel of the source image edge segment in the source image coordinate system, and (Xd, Yd) are the coordinates of the midpoint pixel of the target image edge segment in the target image coordinate system; the source image has size (Hs, Ws), where Hs and Ws are its height and width, and the target image has size (Hd, Wd), where Hd and Wd are its height and width. The complete image rejoining method is shown in Fig. 8a and b. Suppose the relative complete image has size (Hr, Wr); we obtain a binary image of the source image and its size (the size of the pink box in Fig. 8a), and the source image is translated and rotated into the pink box of the relative complete image according to the matching edge segment. Similarly, based on a binary image of the target image, its size (the size of the green box in Fig. 8b), and the matching edge segment, the target image is copied into the green box of the relative complete image through a mask. After these operations, two fragment images of one object are rejoined into a relative complete image. As shown in Fig. 8b, because the image of the oracle bone fragment is bounded by its contour, there is a gap in the relative complete image.
where Ls = Ws – Xs, Ld = Wd – Xd, Ks = Hs – Ys, and Kd = Hd – Yd; Eqs. (2) and (3) are ternary operations.
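Equations (2) and (3) are not reproduced here; one plausible reading, consistent with the definitions of Ls, Ld, Ks, and Kd above, is that the rejoined canvas keeps the larger extent on each side of the aligned midpoints. The sketch below implements that reading with Python ternary expressions; it is our reconstruction under that assumption, not the published formula.

```python
def rejoined_size(Xs, Ys, Hs, Ws, Xd, Yd, Hd, Wd):
    """Reconstructed canvas size (Hr, Wr) for the relative complete image,
    assuming the canvas spans the larger extent on each side of the aligned
    edge-segment midpoints (Xs, Ys) and (Xd, Yd)."""
    # Extents to the right of / below the matching midpoints (as in the text).
    Ls, Ld = Ws - Xs, Wd - Xd
    Ks, Kd = Hs - Ys, Hd - Yd
    # Ternary form: larger left extent plus larger right extent, and likewise
    # for the vertical direction.
    Wr = (Xs if Xs > Xd else Xd) + (Ls if Ls > Ld else Ld)
    Hr = (Ys if Ys > Yd else Yd) + (Ks if Ks > Kd else Kd)
    return Hr, Wr
```

With the midpoints aligned at (max(Xs, Xd), max(Ys, Yd)) in the canvas, both fragments fit inside (Hr, Wr) by construction.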

(a) shows the eight numbers in equations 2 and 3, (b) shows the source and target images, which are transformed into the pink box and green box, respectively, and the target area image is in the blue box.
Second Stage
The second stage takes the candidate images produced by the first stage and evaluates their texture. The feature of the target area image is input to a machine learning or deep learning model to be classified by its texture continuity; the model is pre-trained using the Part 3 subdataset of the OBFI dataset. The deep rejoining model (DRM) uses a two-class classification model to determine whether the two fragments are rejoinable in terms of texture, based on the feature of the TAI shown in Fig. 1.
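The stage-two decision interface, a feature vector of the TAI in and a rejoinable/unrejoinable label out, can be illustrated with a minimal logistic-regression stand-in. The paper's actual models are the machine learning and deep learning classifiers evaluated in the Experiments section; this sketch only shows the shape of the two-class problem, and the function names are ours.

```python
import numpy as np

def train_logreg(X, y, lr=0.1, epochs=500):
    """Minimal stand-in for the stage-two classifier: TAI feature vectors X
    and labels y (1 = rejoinable, 0 = unrejoinable), trained by batch
    gradient descent on the logistic loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # texture-continuity score
        g = p - y                                # gradient of the logistic loss
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def predict(w, b, X):
    """Threshold the continuity score at 0.5 to get rejoinable / unrejoinable."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return (p > 0.5).astype(int)
```

Any of the listed classifiers (SVM, random forest, a CNN on the raw TAI, etc.) can replace this model behind the same interface.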
Experiments
Implementation details
The experiment is designed in two steps: the first step searches for image pairs that have matching edge segments and rejoins the two images into a relative complete image; the second step gives a model to judge whether the internal texture of the relative complete image is continuous by learning from the target area image dataset. In the first step, the coordinate sequence of the edge segment and the area3 of the gap between matching edge segments are used to locate the rejoinable position of the two images, and the HOG feature and the color histogram10 are used to calculate texture continuity via the correlation of the two features. The second step uses machine learning and deep learning models to evaluate the texture continuity score of the TAI. The machine learning models comprise the Bayesian classifier (Bayes), k-nearest neighbors classifier (KNN), logistic regression (LR), random forest (RForest), decision tree (DTree), gradient boosting (GBoost), support vector machine (SVM), multilayer perceptron (MLP), K-means, and Ada boosting (ABoost). The deep learning models comprise ResNet50, Inception V3, Xception, AlexNet, VGG19, and MobileNet.
To calculate the longest similar edge segment, one point is sampled every 11 pixels in the source image edge segment, the sampling interval of each image edge segment ranges from 10 to 20, and the distance threshold is set adaptively, at 3.5 times a factor determined by the sampling interval and the number of sampling points. After that, a whole image of the two fragments is rejoined by the complete image rejoining algorithm4,5. Then, the histogram of oriented gradients (HOG) and the color histogram of the image3 are used to judge whether the textures of two images can be rejoined together, and the target area image dataset is given to learn correct determination based on the texture continuity score33,34. Considering the performance of HOG features in calculating texture continuity, the HOG feature of the target area image is fed into the machine learning models, while the target area image is input directly into the deep learning models for training.
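The correlation score behind the HOG and color histogram comparison can be illustrated with a plain Pearson correlation between two feature histograms. The paper does not specify the exact feature extraction (HOG cell sizes, histogram bins), so the `color_histogram` helper below is only an assumed per-channel version for illustration.

```python
import numpy as np

def hist_correlation(h1, h2):
    """Pearson correlation of two feature histograms, the kind of score used
    to compare the HOG or colour histograms of two fragment images."""
    h1 = np.asarray(h1, dtype=float) - np.mean(h1)
    h2 = np.asarray(h2, dtype=float) - np.mean(h2)
    denom = np.sqrt((h1 ** 2).sum() * (h2 ** 2).sum())
    return float((h1 * h2).sum() / denom) if denom else 0.0

def color_histogram(img, bins=16):
    """Assumed helper: per-channel intensity histogram of an H x W x C image,
    concatenated across channels and normalised to sum to 1."""
    img = np.asarray(img)
    hists = [np.histogram(img[..., c], bins=bins, range=(0, 256))[0]
             for c in range(img.shape[-1])]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()
```

Identical histograms score 1.0; for grayscale subdatasets such as DJ, only the HOG-based correlation is available, which matches the observation in the tables below that color similarity is unreliable across color spaces.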
Traditional methods
1. Time consumption. The sampling efficiency of LSES and EEDR is shown in Table 4, where SP denotes the number of sampling points along one edge segment of 1420 pixels, and the first column denotes the interval or distance between two sampling points used by the LSES or EEDR algorithm; in the third row, the second number, 8, denotes that LSES took 8 ms to sample 10 points (SP = 10) at an interval of 11 pixels (Interval = 11). As can be seen, when the numbers of sampling points of the LSES and EEDR algorithms are equal at the same interval or distance, LSES costs less time than EEDR; and when the interval or the number of sampling points is increased to the same value, the time cost of the LSES algorithm stays stable or changes little while the time cost of the EEDR algorithm increases more, with LSES costing about half the time of the EEDR algorithm. This shows that our LSES has higher efficiency than the EEDR algorithm.
2. Invariance to rotation and translation. The rotation and translation invariance of LSES and EEDR is shown in Table 5, where the location performance of the LSES and EEDR algorithms is given. Because this article focuses on the rejoining of image pairs, locating from A to B and from B to A are counted as different cases. There are 207 image pairs correctly located by the LSES algorithm, which dislocated 13 image pairs, while the EEDR algorithm dislocated 18 image pairs. This shows that LSES performs better than the EEDR algorithm in locating matching edge segments of rotated and translated images.
3. Invariance to different color spaces. As shown in Fig. 2, there are 23 pairs of joinable images in the ZLC subdataset of single yellowing oracle bone fragment images. In Table 6, the first column is the name of the oracle bone fragment image, the second column is the sum of the distances between the edge segment sample points of the two images, the third column is the gap area in the relative complete image rejoined from the two images, the fourth column is the color histogram correlation of the two images, and the fifth column is the HOG feature similarity of the two images. Given one image of an oracle bone fragment, more than five hundred candidate matching images can be found in the ZLC subdataset whose distance, gap area, color histogram, and HOG feature scores meet the limiting conditions. The parameter k and the interval are set to 6 and 11, respectively; the distance threshold is set to 3.5*2*(k–1) and the gap area threshold is set to 2*30*(k–1)*interval. If a target exists that can be correctly rejoined with the source image, it can be screened out by the limits on the distance and the gap area, as shown in Table 6: the distance between the sampling points of each two images' edge segments ranges from 7.83 to 34.92, and the gap areas of the relative complete images range from 85 to 326. The HOG feature correlations mostly range from 0.50 to 0.88, and most color histogram similarities are higher than 0.45, with only one lower than 0.45; further statistical analysis is needed to establish their effectiveness.
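The adaptive limits quoted above follow directly from k and the interval; the sketch below simply reproduces the stated formulas, so the screening condition can be recomputed for any parameter setting.

```python
def lses_thresholds(k, interval):
    """Adaptive limits as stated in the text: a candidate pair is kept only if
    its sampling-point distance and rejoined-gap area fall below these."""
    distance_threshold = 3.5 * 2 * (k - 1)        # sum-of-distances limit
    gap_area_threshold = 2 * 30 * (k - 1) * interval  # gap-area limit (pixels)
    return distance_threshold, gap_area_threshold
```

For the ZLC, BZ, and DJ experiments (k = 6, interval = 11) this gives a distance limit of 35.0 and a gap-area limit of 3300, comfortably above the observed ranges reported in Tables 6, 7, and 9.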
As shown in Table 7, there are 41 pairs of images in the BZ subdataset, and the two images of each pair can be correctly rejoined into a whole image; most images in this dataset have high contrast. The parameter k and the interval are set to 6 and 11, respectively; the distances range from 7.24 to 33.47, the gap areas range from 127 to 314, and most HOG feature correlations range from 0.41 to 0.97. The color histogram similarities are generally higher, but four pairs have similarity scores of 0.29, 0.27, 0.26, and even 0.17; as shown in Fig. 3, three images can be rejoined together but have different color spaces, so color histogram similarity does not work well.
As shown in Table 8, there are 36 pairs of images in the LS subdataset, and the two images of each pair can be correctly rejoined into a whole image; most images in the LS subdataset have low contrast. The parameter k and the interval are set to 5 and 11, respectively; the distances range from 5.41 to 19.32, the gap areas range from 43 to 234, and the HOG feature correlations range from 0.19 to 0.90. However, some pairs of rejoinable images have color histogram similarity scores of 0.06, 0.04, and even 0.03, as shown in Fig. 4, because the two images have low contrast.
As shown in Table 9, there are 10 pairs of DJ fragment images, and each pair can be correctly rejoined; the parameters k and interval are set to 6 and 11, respectively. Because all images in the DJ subdataset are grayscale, matching images have no color histogram similarity score; as shown in Fig. 5 and Table 9, the distances range from 10.07 to 34.46, the gap areas range from 100 to 279, and the HOG feature correlations range from 0.52 to 0.94. The performance on the DJ dataset is the best.
4. Evaluation of the traditional methods. Under the distance and gap area limits, matching oracle bone fragment images can be retrieved from among many unrejoinable images, so these two measures are effective in rejoining oracle bone fragment images. Because of different storage environments, rejoinable oracle bone fragments exhibit a variety of colors, so color histogram similarity does not work well in the rejoining task. In theory, the grayscale gradient directions of two images of one object remain consistent, so hog feature correlation is discriminative for judging whether two image textures are continuous, as shown in Tables 6–9; the hog feature is a common and effective method in pedestrian tracking, and it is also suitable for object fragment image rejoining tasks. However, further experiments and statistical analysis of the datasets and the methods' effectiveness are needed, and classical machine learning and deep learning methods should be applied to the rejoining work.
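A minimal numpy sketch of a hog-style score illustrates why gradient orientations are robust where color histograms fail: a uniform brightness shift (a proxy for different yellowing) leaves the gradient field, and hence the descriptor, unchanged. This is a simplified global orientation histogram, not the full block-normalized HOG presumably used in the experiments:

```python
import numpy as np

def hog_descriptor(img, bins=9):
    """Coarse HOG-like descriptor: histogram of gradient orientations
    weighted by gradient magnitude (simplified global sketch)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, 180.0), weights=mag)
    s = hist.sum()
    return hist / s if s > 0 else hist

def hog_correlation(img_a, img_b):
    """Pearson correlation of the two descriptors, mirroring the
    hog-feature correlation score reported in Tables 6-9."""
    return float(np.corrcoef(hog_descriptor(img_a), hog_descriptor(img_b))[0, 1])
```

Because `np.gradient` removes any constant offset, `hog_correlation(img, img + c)` equals 1 for any uniform shift c, whereas a color histogram computed on the same pair would change.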
When rejoining large-scale oracle bone fragment images, given a source image, a method has to search all images in a single image sub-dataset to find its matching target images, but only one fragment image can be correctly rejoined to a given segment of the source image contour, or the correct target image is lost. If two rejoinable images exist in the sub-dataset and a method finds the image pair and assigns it a matching score ranked within the top 300, archeologists would consider the method practical. As shown in Table 10, the average distance ranking in the distance result set of the LSES algorithm is 188, which meets the needs of archeologists; this means that the distance between sampling points on the edge segments of oracle bone fragment images works well.
The TopK indicator is used to show the performance of the different evaluation measures. As shown in Fig. 9, the horizontal axis represents the rank of each evaluation measure (the distance, the gap area, the hog feature, and the color histogram similarity), and the vertical axis represents the probability of correctly rejoined oracle bone fragments. When the minimum distance ranking ranges from 1 to 300, correctly rejoined combinations account for 76.82%, and the probability gradually decreases along the ranks; this shows that calculating the minimum distance between two images' edge sampling points is effective, and the method has found dozens of rejoinable oracle bone images, as shown in Fig. 10. When the gap area ranking of rejoined fragment images ranges from 1 to 400, correctly rejoined combinations account for 53.88%; although the probability decreases more slowly along the ranks, this measure still works, as shown in Fig. 9b. As shown in Fig. 9c and d, the hog feature and the color histogram are used to score the texture continuity between two oracle bone fragment images: the correlation of the hog features works effectively, whereas the color histogram has lower discrimination, so its similarity is almost invalid. The reason is that the texture of rejoinable fragment images is continuous, but the fragments themselves are not the same color. All of these are unsupervised methods.
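The TopK indicator itself is simple to state: given the rank of each correctly rejoinable pair in a score list, it is the share of pairs whose rank does not exceed k. A sketch (the curves in Fig. 9 are these values at successive cut-offs):

```python
def topk_hit_rate(ranks, k):
    """Share of correctly rejoinable pairs ranked within the top k
    of the evaluation measure's score list."""
    return sum(1 for r in ranks if r <= k) / len(ranks)
```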

Oracle bone fragment images and their reassembling. (a), (b), (c), and (d) show proportions of topk performance; the first rectangular box in (a) shows the summed proportion of top 1–100, and the interval is 100.

(a) shows the matching edge segments of the source and target images, which are from the LS single image sub-dataset, (b) shows the rejoined image, and (c) and (d) are photographs of the front and back of the rejoined oracle bone fragment in the LS museum.
Machine learning and deep learning models
A two-stage method has been adopted to rejoin the images of oracle bone fragments. The second stage calculates the texture correlation of the target area image using machine learning or deep learning models. The image dataset includes 138,855 target area images, as in Fig. 8: 115,893 unrejoinable target area images and 22,962 rejoinable target area images. The dataset is partitioned into training and test sets at a 3:1 ratio and used to evaluate machine learning and deep learning methods. As shown in Table 11, where acc indicates accuracy, rec recall, pre precision, and f1-sc f1-score, some classical machine learning methods have been tested on our dataset, including Bayes, KNN, LR, RForest, DTree, GBoost, SVM, MLP, Kmeans, and ABoost. Most methods' training and test accuracy is higher than 70%, and seven methods' f1-score is higher than 60%, but on the practical rejoining dataset, which includes 110 rejoinable target area images and 110 unrejoinable target area images, these machine learning methods' accuracy and f1-score are lower than 64%. Deep learning methods (such as ResNet50, Inception V3, Xception, AlexNet, VGG19, and MobileNet) have been tested on the same dataset, and most of them achieve better performance only in training and recall; AlexNet achieves a moderate effect, with a Roc-Auc score higher than 70%. Supervised deep learning and machine learning methods will miss correct rejoinable combinations, while the traditional methods can search for rejoinable images by a threshold; both edge similarity and texture continuity should therefore be used together to effectively determine whether two images are rejoinable.
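The abbreviations in Table 11 correspond to the standard binary classification metrics, computed here for label 1 = rejoinable. A self-contained sketch, independent of any particular classifier:

```python
def binary_metrics(y_true, y_pred):
    """acc, pre, rec and f1-sc for the binary rejoinable (1) /
    unrejoinable (0) classification of target area images."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / len(y_true)
    pre = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * pre * rec / (pre + rec) if pre + rec else 0.0
    return {"acc": acc, "pre": pre, "rec": rec, "f1-sc": f1}
```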
Two-stage method using LSES and AlexNet
As shown in Table 11, the Bayesian classifier has the best performance among the machine learning models and AlexNet has the best performance among the deep learning models, so in the second stage of DRM we use the Bayesian classifier and deep learning models to learn to judge whether two images' textures are continuous. The performance of the EEDR, LSES, and LCS algorithms is compared, because the LCS algorithm is the most representative and general method10 and is particularly suitable for the task of rejoining oracle bone fragment images. As shown in Table 12, iPair denotes the number of image pairs found by each algorithm, mRank denotes the mean ranking, and rTop200 denotes the ratio within the top 200. EEDR and LSES each found more than 200 image pairs of oracle bone fragments, but the mean rank of EEDR is 188. The LCS algorithm found only 73 pairs, with a mean rank below 78 and a top-200 proportion of 30.46%; missing pixels at the edges cause its low performance, since the coordinate sequences of pixels at the fracture locations of the oracle bone in the two images are not completely equal. LCS imposes strict constraints, while the LSES algorithm relaxes them, avoiding the computation of a common pixel coordinate sequence in the first stage, so LSES achieves the best overall performance. After adding a second stage to compute the texture continuity of the target area image, the DRM combining LSES and AlexNet performs best: although it found 181 image pairs, its mean rank is 151, which is better than LSES. VGG19 has the same performance as the LSES algorithm alone, which shows that it brings no improvement over using the LSES algorithm only.
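The three columns of Table 12 can be reproduced from the rank of each found pair. A sketch with illustrative ranks (the rank values below are made up for demonstration):

```python
def pair_search_metrics(ranks, top=200):
    """iPair (number of image pairs found), mRank (mean rank) and
    rTop200 (share of pairs ranked within the top 200), as in Table 12.
    `ranks` holds the rank of each correctly found image pair."""
    i_pair = len(ranks)
    m_rank = sum(ranks) / i_pair
    r_top = sum(1 for r in ranks if r <= top) / i_pair
    return i_pair, m_rank, r_top
```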
Conclusion and discussion
The value of the oracle bone fragment image dataset is confirmed in the practice of rejoining oracle bone fragments: although rejoinable images have been searched for a long time, whether by archeologists or by programs, there are still undiscovered pairs of rejoinable images among the four subdatasets of single oracle bone fragment images. The challenge is that each sub-dataset contains thousands of images, among them many with matching edge segments and textures, which limits the accuracy and efficiency gains of current methods; nevertheless, the related technologies can still find rejoinable image pairs.