The Hough transform (Duda and Hart, 1972), which started out as a technique to detect lines in an image, has been generalised and extended to detect curves in 2D and 3D.

Here, we understand how an image is transformed into the hough space for line detection and implement it in Python.

A line in Cartesian form has the equation:

y = mx + b where: m = gradient or slope of the line (rise/run) b = y-intercept

Given a set of edge points or a binary image indicating edges, we want to find as many lines that connect these points in image space.

Say we have 2 edge points (x1,y1x1,y1) and (x2,y2x2,y2). For each edge point at various gradient values (mm=-0.5, 1.0, 1.5, etc), we calculate the corresponding bb values. The image below shows the various lines through an edge point in image space and the plot of these lines in parameter space. Points which are collinear in the cartesian image space will intersect at a point in mm-bb parameter space.

mb_space

**All points on a line in image space intersect at a common point in parameter space. This common point (mm, bb) represents the line in image space.**

Unfortunately, the slope, mm, is undefined when the line is vertical (division by 0!).

To overcome this, we use another parameter space, the hough space.

A line in Polar coordinate system has the equation:

ρ = x cos θ + y sin θ where: ρ (rho) = distance from origin to the line. [-max_dist to max_dist]. max_dist is the diagonal length of the image. θ = angle from origin to the line. [-90° to 90°]

If you’re interested in how ρρ is derived, I’ve laid out the maths at the bottom extras section: deriving rho.

To explain the hough transform, I’ll use a simple example. I’ve created an input binary image of size 30 x 30 pixels with points at (5, 25) and (20, 10) shown below. The image is transformed to the hough space by calculating ρρ with a point at each angle from −90∘−90∘ to 90∘90∘ (negative angles are anti-clockwise starting horizontally from the origin and positive angles are clockwise). The points in hough space make a sinusoidal curve.

The value of ρρ at various angles for the 2 edge points:

point | −90∘−90∘ | −45∘−45∘ | 0∘0∘ | 45∘45∘ | 90∘90∘ |
---|---|---|---|---|---|

(5, 25) | -25 | -14 | 5 | 21 | 25 |

(20, 10) | -10 | 7 | 20 | 21 | 10 |

We see that the curves in hough space intersect at 45∘45∘ with ρ=21ρ=21.

Curves generated by collinear points in the image space intersect in peaks (ρ,θ)(ρ,θ) in the Hough transform space. The more curves intersect at a point, the more “votes” a line in image space will receive. We’ll see this in the next implementation section.

**Corner or edge detection**. (E.g. using canny, sobel, adaptive thresholding). The resultant binary/grey image will have 0s indicating non-edges and 1s or above indicating edges. This is our input image.**Rho range and Theta range creation**. ρρ ranges from -max_dist to max_dist where max_dist is the diagonal length of the input image. θθ ranges from −90∘−90∘ to 90∘90∘. You can have more or less bins in the ranges to tradeoff accuracy, space and speed. E.g. Every third angle in −90∘−90∘ to 90∘90∘ to reduce from 180 to 60 values.**Hough accumulator**of θθ vs ρρ. It is a 2D array with the number of rows equal to the number of ρρ values and the number of columns equal to the number of θθ values.**Voting in the accumulator**. For each edge point and for each θθ value, find the nearest ρρ value and increment that index in the accumulator. Each element tells how many points/pixels contributed “votes” for potential line candidates with parameters (ρ,θ)(ρ,θ).**Peak finding**. Local maxima in the accumulator indicates the parameters of the most prominent lines in the input image. Peaks can be found most easily by applying a threshold or a relative threshold (values equal to or greater than some fixed percentage of the global maximum value).

Full code on github

import numpy as np def hough_line(img): # Rho and Theta ranges thetas = np.deg2rad(np.arange(-90.0, 90.0)) width, height = img.shape diag_len = np.ceil(np.sqrt(width * width + height * height)) # max_dist rhos = np.linspace(-diag_len, diag_len, diag_len * 2.0) # Cache some resuable values cos_t = np.cos(thetas) sin_t = np.sin(thetas) num_thetas = len(thetas) # Hough accumulator array of theta vs rho accumulator = np.zeros((2 * diag_len, num_thetas), dtype=np.uint64) y_idxs, x_idxs = np.nonzero(img) # (row, col) indexes to edges # Vote in the hough accumulator for i in range(len(x_idxs)): x = x_idxs[i] y = y_idxs[i] for t_idx in range(num_thetas): # Calculate rho. diag_len is added for a positive index rho = round(x * cos_t[t_idx] + y * sin_t[t_idx]) + diag_len accumulator[rho, t_idx] += 1 return accumulator, thetas, rhos

Usage:

# Create binary image and call hough_line image = np.zeros((50,50)) image[10:40, 10:40] = np.eye(30) accumulator, thetas, rhos = hough_line(image) # Easiest peak finding based on max votes idx = np.argmax(accumulator) rho = rhos[idx / accumulator.shape[1]] theta = thetas[idx % accumulator.shape[1]] print "rho={0:.2f}, theta={1:.0f}".format(rho, np.rad2deg(theta))

rho=0.50, theta=-45

Hough transform (and the faster probabilistic version) is available in openCV and scikit-image.

Hough transform can be extended to detect circles of the equation

r2=(x−a)2+(y−b)2r2=(x−a)2+(y−b)2 in the parameter space, ρ=(a,b,r)ρ=(a,b,r).

Furthermore, it can be generalized to detect arbitrary shapes (D. H. Ballard, 1981).

Another approach is the Progressive Probabilistic Hough Transform (Galamhos et al, 1999). The algorithm uses random subsets of voting points in the accumulator and checks for the longest segment of pixels with minimum gaps. Line segments that exceed a minimum length threshold are added to the list. This returns the beginning and end point of each line segment in the image. It has 3 thresholds: a minimum number of votes in the Hough accumulator, a maximum line gap for merging and a minimum line length.

With basic trigonometry, we know that for right-angled triangles,

sinθ=oppositehypotenusesinθ=oppositehypotenuse and cosθ=adjacenthypotenusecosθ=adjacenthypotenuse.

We want to convert the cartesian form y=mx+by=mx+b with parameters (m,b)(m,b) to polar form with parameters (ρ,θ)(ρ,θ).

The line from the origin with distance ρρ has a gradient of sinθcosθsinθcosθ. The line of interest, which is perpendicular to it, will have a negative reciprocal gradient value of −cosθsinθ−cosθsinθ. For bb, the y-intercept of the line of interest, sinθ=ρbsinθ=ρb.

With m=−cosθsinθm=−cosθsinθ and b=ρsinθb=ρsinθ, we get ρ=x cosθ+y sinθρ=x cosθ+y sinθ.