Recognition of Lane Markings in Factories and a Self-Position Estimation Method Using AR Markers

In recent years, many unmanned transfer robots have been introduced into factories and warehouses. Functions such as self-position estimation are indispensable if automated guided vehicles are to operate freely. In this research, we propose a robot control method that estimates the robot's position in a factory using lane markings, and we verify the measurement accuracy of the system in real time. A camera image is used to read AR markers and lane markings, from which the distance between the camera and a lane marking is calculated and the self-position is estimated. The lane markings and AR markers are photographed with the camera held horizontally, and the distance is calculated from the position and tilt of the lane marking in the image. When an AR marker is detected, the calibrated camera is used to calculate the distance and angle to the marker, and the self-position is estimated by comparing these with the actual coordinates. In the experiment on measuring the distance to a lane marking, a 30 [mm] wide strip of thick paper representing the lane marking was gradually brought closer to the stationary camera and the distance was calculated at each step. Two distance-calculation experiments were conducted, with the camera oriented horizontally and diagonally with respect to the lane marking. In the experiment on self-position estimation using AR markers, we built a model of a factory aisle, placed the camera at multiple points, and measured the error from the theoretical values. The self-position was expressed in two dimensions by assigning x- and y-axes to the model in the actual coordinate system. In both experiments, 100 consecutive data points were acquired at each measurement point and the variability of the data was investigated to verify the accuracy.


Introduction
In recent years, the number of automated guided vehicle systems delivered has been rising, which indicates that demand for such systems is increasing [1]. In the future, considering the declining number of workers and the problem of running costs, unmanned factories are expected to attract attention [2]. However, the spread of automated guided vehicle systems in Japan currently lags behind other countries. The reasons are that the initial cost of introducing a system is high, and that introduction is difficult because the aisles in Japanese factories and warehouses are narrower than those in overseas factories and warehouses.
Many of the automated guided vehicle systems currently in use fall into one of two categories: low-priced systems restricted to fixed, monotonous routes, or expensive systems in which various routes can be set by installing multiple sensors. Ease and accuracy of route changes are indispensable for automated guided vehicles, and the loss of efficiency caused by losing sight of the route is the most important problem to avoid. However, improving accuracy tends to make the system more complex and therefore more costly to implement. In this research, we are therefore developing a method that uses a camera and AR markers as a low-cost system for estimating the self-position of a transfer robot in a factory [4]. Here we focus on real-time self-position estimation with this system and on reducing its error.

Summary of detection method
In this study, a camera is used to capture the lane markings and AR markers for measurement. Figure 1 gives an overview of the system used in this study. The camera is placed so that the center of its lens is 180 [mm] above the floor, aimed parallel to the floor, and used to capture images. Figure 2 shows an image captured by the camera. As the figure shows, the lane markings and AR markers in the projected area are detected.

Preprocessing in image processing
Grayscale conversion, histogram equalization, masking, edge detection, and the Hough transform are performed as preprocessing for lane-marking detection. Grayscale conversion is performed to speed up image processing. Histogram equalization is performed to make the image easier to analyze. In this experiment, only the lower half of the image captured by the camera contains the necessary information, so the upper half is masked. Edge detection is performed to emphasize the lane markings for the Hough transform. The Hough transform is performed to extract only the lane markings from the image [3]. Camera calibration is performed as preprocessing for estimating the self-position from the AR markers; performing camera calibration corrects the distortion of the camera image.

Principle of calculating the distance to the lane marking after detection
This section describes the calculation principle for finding the distance from the lane marking from the image after lane marking detection.
By applying the processing flow described in Section 2.2 to the image input from the camera, an image such as that shown in Fig. 3 is obtained when a lane marking is in the camera's field of view.

Figure 3: Example of a lane-marking detection image
The obtained image has its origin at the upper left, with the x-axis in the horizontal direction and the y-axis in the vertical direction, and pixel coordinates are converted to actual lengths based on the number of pixels of the camera. Figure 4 below shows the geometry of the camera's shooting range. The distance z0 from the lens at which the angle of view is cut off by the floor, for a camera height h, can be calculated from the focal length and the image sensor size (sv in the vertical direction, sh in the horizontal direction) by Eq. (1). The maximum shooting range at a distance z from the lens (V in the vertical direction, H in the horizontal direction) can be obtained by Eqs. (2) and (3), respectively.
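Under a simple pinhole-camera model, Eqs. (1) to (3) plausibly take the following form; the focal length and sensor dimensions used here are illustrative assumptions, not the actual parameters of the camera in this system.

```python
def cutoff_distance(f_mm, sensor_v_mm, height_mm):
    """Eq. (1) sketch: distance z0 at which the lower edge of the view meets the floor.

    The bottom of the image looks down from the optical axis at an angle of
    atan((sensor_v / 2) / f), so for a camera at height h the floor first
    becomes visible at z0 = 2 * f * h / sensor_v.
    """
    return 2.0 * f_mm * height_mm / sensor_v_mm

def shooting_range(f_mm, sensor_dim_mm, z_mm):
    """Eqs. (2)/(3) sketch: maximum range covered at distance z (similar triangles)."""
    return sensor_dim_mm * z_mm / f_mm

# Illustrative values: f = 4 mm, a 3.6 mm x 2.7 mm sensor, camera 180 mm high.
z0 = cutoff_distance(4.0, 2.7, 180.0)   # distance at which the floor appears
H = shooting_range(4.0, 3.6, 1000.0)    # horizontal coverage at z = 1 m
```

The same similar-triangle relation is what allows a pixel offset in the image to be scaled into millimeters at a known distance z.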
Figure 5 below shows the positional relationship between the camera of this system and the ground.

Figure 5: Coordinate relationship between camera and ground
The camera is placed at the origin on the y-axis and pointed in the z-axis direction to take the picture. When a lane marking is in the shooting range, the starting point of the lane marking at the bottom of the screen is defined as P1(x1, y1), and a second point as P2(x2, y1 + 20).

Calculation of the distance from the lane marking in the side direction of the camera
In this section, the distance from the lane marking in the side direction of the camera is calculated. If P1 and P2 are set as shown in Fig. 5, their coordinates can be obtained by Eqs. (4) and (5).
Next, let the numbers of pixels between the z-axis and the points P1 and P2 on the image be n1 and n2, respectively. By combining the obtained n1 and n2 with Eqs. (8) and (9), the maximum horizontal shooting ranges H1 and H2 at each point can be obtained.
By substituting the calculated values into Eqs. (10) and (11) described below, the distances w1 and w2 between the z-axis and the points P1 and P2 in the real coordinate system can be obtained.
Next, Fig. 6 shows the coordinate relationship of each point when the camera is viewed from directly above. From this, the distance W from the lane marking in the side direction of the camera can be calculated by finding the intersection of the line connecting the two points with the x-axis.

Calculation of the shortest distance between the camera and the lane marking
This section shows the principle of calculating the shortest distance between the camera and the lane marking. Figure 7 shows the coordinate relationship of each point when the camera is viewed from directly above. If the intersection of the straight line connecting points P1 and P2 with the z-axis is P3(0, z3), the distance z3 can be obtained in the same way as for P1 and P2. From these values, the shortest distance l between the camera and the lane marking can be calculated by Eq. (12).
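Assuming P1 and P2 have already been converted to real ground coordinates (x, z) with the camera at the origin, the side distance W and the shortest distance l reduce to elementary line geometry. The following is a sketch of that geometry, not a verbatim reproduction of the paper's Eqs. (10) to (12).

```python
import math

def side_distance(p1, p2):
    """W: intersection of the line through p1, p2 with the x-axis (z = 0).

    Points are (x, z) in the real coordinate system, camera at the origin.
    Assumes the two points have different z (the marking is not sideways-on).
    """
    (x1, z1), (x2, z2) = p1, p2
    # Parametrize the line through the two points and solve for z = 0.
    return x1 - z1 * (x2 - x1) / (z2 - z1)

def shortest_distance(p1, p2):
    """l: perpendicular distance from the camera (origin) to the line through p1, p2."""
    (x1, z1), (x2, z2) = p1, p2
    return abs(x1 * z2 - x2 * z1) / math.hypot(x2 - x1, z2 - z1)

# Example: two points on a lane marking, in millimeters.
W = side_distance((100.0, 500.0), (50.0, 700.0))      # 225.0 mm
l = shortest_distance((100.0, 500.0), (50.0, 700.0))  # ~218.3 mm
```

When the camera is parallel to the marking, W and l coincide; the gap between them grows with the camera's tilt, which is why the paper evaluates both quantities.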

Principle of camera position estimation after AR marker detection
This section describes the calculation principle for finding the distance between the camera and the AR marker from the image after the AR marker is detected.

Camera calibration
In this study, camera calibration was performed to estimate the parameters of the camera's image sensor, and the estimated parameters are used to estimate the position of the camera. Figure 8 shows the checkerboard used for calibration. To improve the accuracy of the parameters, the checkerboard was photographed while moving the camera, 40 sample images were acquired, and the corners of the checkerboard were detected in each image. The camera parameters and distortion parameters are calculated by comparing the corner coordinates read from the images with the known local coordinates; these parameters are the values used when performing distortion correction.
Radial distortion, in which lines that are actually straight appear curved in the captured image, is corrected as shown in Eqs. (13) and (14).
Similarly, tangential distortion, which is caused by the lens and the image plane not being perfectly parallel, is corrected by Eqs. (15) and (16).
Next, the internal and external parameters of the camera are described. The internal parameters, which are values specific to the camera, are expressed by Eq. (17).
External parameters are rotation and translation parameters for converting the coordinates of a 3D point in one coordinate system to the coordinates in another coordinate system.
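The distortion and intrinsic models behind Eqs. (13) to (17) can be written out as a small numerical sketch. The coefficient names (k1, k2, p1, p2) and the intrinsic matrix layout follow the common pinhole-camera convention and are assumptions about the paper's notation, not taken from its equations.

```python
def distort(x, y, k1=0.0, k2=0.0, p1=0.0, p2=0.0):
    """Apply radial (cf. Eqs. 13-14) and tangential (cf. Eqs. 15-16)
    distortion to normalized image coordinates (x, y)."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return x_d, y_d

def project(x, y, fx, fy, cx, cy):
    """Apply an internal parameter matrix of the usual form of Eq. (17):
    [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] to normalized coordinates."""
    return fx * x + cx, fy * y + cy

# With all distortion coefficients zero, distortion is the identity, and a
# point on the optical axis projects to the principal point (cx, cy).
u, v = project(*distort(0.0, 0.0), fx=800.0, fy=800.0, cx=320.0, cy=240.0)
```

Calibration estimates (fx, fy, cx, cy) together with (k1, k2, p1, p2) by comparing detected checkerboard corners with their known local coordinates, exactly as described above.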

Camera position estimation from AR markers
In this study, we first estimated the distance of the camera with the center of the AR marker as the origin of the 3D coordinate system. Figure 9 shows an image in which the read pose data are drawn as a 3D coordinate system over the image of the detected AR marker.

Figure 9: AR marker with the 3D coordinate system drawn on it
As shown in Fig. 10, the size and the number of bits of the AR marker are set in advance in this study, and the translation vector and rotation vector of the camera with respect to the AR marker are obtained using the camera parameters and distortion parameters explained in the previous section. The position C(x, y, z) of the camera in the three-dimensional coordinate system was estimated from the translation vector.

Figure 10: Position of AR marker and camera in 3D coordinate system
Since the premise of this study is to obtain the distance between the AR marker and the camera on the xy plane, the height h of the camera was set as a fixed value and the distance was calculated by Eq. (18).
Next, the positional relationship between the AR marker and the camera is found. The vector l(x, y) was obtained from the estimated values of x and y, and the angle it forms with the unit vector n(1, 0) in the x-axis direction was obtained from Eq. (19).
The position of the camera was then estimated from the AR marker by combining this angle with the distance to the camera.
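Given a translation vector t = (tx, ty, tz) estimated from the AR marker, the planar distance of Eq. (18) and the angle of Eq. (19) can be sketched as follows. Treating the camera height h as fixed and computing the angle with atan2 against the unit vector n(1, 0) are our reading of the text, not a verbatim reproduction of the paper's equations.

```python
import math

def planar_distance(t, h):
    """Eq. (18) sketch: distance on the xy plane, with camera height h fixed.

    If the full 3D distance |t| is known and the height component is fixed
    at h, the planar component follows from the Pythagorean theorem.
    """
    tx, ty, tz = t
    d3_sq = tx * tx + ty * ty + tz * tz
    return math.sqrt(max(d3_sq - h * h, 0.0))

def heading_angle(lx, ly):
    """Eq. (19) sketch: angle between l = (lx, ly) and the unit vector n = (1, 0)."""
    return math.atan2(ly, lx)

# Example: marker 300 mm ahead and 400 mm to the side, camera 180 mm high.
t = (300.0, 400.0, 180.0)
d = planar_distance(t, 180.0)                       # 500.0 mm on the xy plane
theta = math.degrees(heading_angle(300.0, 400.0))   # ~53.13 degrees
```

Adding this distance and angle to the known real-world coordinates of the marker's center then yields the camera's (x, y) position in the factory coordinate system.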

The real-time system
In this experiment, real-time processing was realized using a processing unit running a 64-bit operating system with an Intel(R) Core(TM) i5-9400F CPU @ 2.90 GHz, and a webcam (Panasonic BB-SW174WA) for video acquisition. Figure 11 below shows the camera used for real-time operation of this system, and Table 1 shows its parameters. In this system, Spyder was used as the development environment and Python as the language to process the video captured by the webcam.

Overview of lane marking accuracy verification
In this experiment, we created a model that makes it easy to move the lane marking and verified the detection accuracy. Even in a stationary state, the effect of surrounding light sources such as fluorescent lamps on each pixel value changes over time. Therefore, in this section, we verified the scatter of the data when a fixed number of data points were continuously acquired in real time with the camera and the lane marking stationary, as well as the measurement accuracy of the detection system itself.

Experimental results
The table below summarizes the measurement errors and data scatter obtained by verifying the measurement accuracy of the camera in the stationary state.

Table 3: Measurement error and data scatter when the camera is tilted

From the above results, when the camera is horizontal to the lane marking, the distance can be measured with an error of at most 18 [mm] from the theoretical value, so practical lane-marking detection is considered possible. Likewise, when the camera is tilted with respect to the lane marking, the distance can be measured with an error of at most 27 [mm] from the theoretical value, so practical lane-marking detection is again considered possible. The data scatter, taken as the maximum value minus the minimum value of the measurements at each distance, was almost zero for both horizontal and diagonal shooting, although some scatter appeared at certain distances. This is considered to be an error caused by using a 30 [mm] wide strip of thick paper as the lane marking in this experiment.

AR marker detection accuracy verification
In this experiment, self-position estimation was performed using a model. Even in the stationary state, the influence of surrounding light sources such as fluorescent lamps on each pixel value changes with time. Therefore, in this section, we verified the scatter of the data when a fixed number of data points were continuously acquired in real time with the camera and the AR markers stationary, as well as the measurement accuracy of the self-position estimation system itself. Figure 3 shows a schematic diagram of the model used in this experiment. The accuracy was verified by moving the camera to five arbitrary points on the stationary model. The orientation of the camera was set to directions likely to be used for transport in an actual factory. The lower left of Fig. 3 was set as (0, 0) in the coordinate system, coordinates were assigned to the center of each marker, and the positional relationship with the camera was calculated from there to estimate the self-position. In verifying the accuracy of self-position estimation using the AR markers, 100 consecutive data points were recorded at each point, and the scatter of the recorded data and the error from the true value were compared. Each AR marker is a 50 [mm] x 50 [mm] piece of thick paper.

Experimental results
The results obtained in this experiment are shown below. From these results, for self-position estimation by AR markers, the position could be estimated with a maximum error of 16 [mm] in the x direction and 21 [mm] in the y direction at every point except the one at real coordinates (500, 1000), so practical self-position estimation is considered possible. For the point at real coordinates (500, 1000), the maximum error was 1011 [mm] on the x-axis and 460 [mm] on the y-axis. This is considered to be because the z-axis flipping phenomenon, in which the z-axis is inverted when the AR marker is read, occurred. This phenomenon arises because it cannot be judged from the image alone whether the AR marker is hanging from the ceiling or attached to the ground. Since it is unlikely to occur when the distance to the AR marker is short, we would like to devise ways to reduce this false recognition as much as possible in the future.

Conclusions
In this research, in order to allow a robot to estimate its own position in a factory and travel freely to a target point, we took an approach different from conventional automated transfer robots: the distance from the lane markings was measured and the self-position was estimated using AR markers. The points in which this system improves on conventional methods are described below.
Since the system can recognize the lane markings originally drawn in a factory or warehouse and measure the distance to them, there is no need to lay new magnetic tape along the path as in the line-tracing method. Since this system uses image processing with a camera, the camera information can also be used for other purposes such as obstacle detection. Unlike conventional self-position estimation using SLAM, it can be operated immediately by simply installing AR markers on the floor. Compared with installing reflectors as landmarks, AR markers can be mounted flexibly because there are few restrictions on the installation location. And unlike methods that install QR codes on the floor, the self-position can be estimated even at a distance from the marker, so even if there is an obstacle on the route, the self-position can be estimated after avoiding it.