Hough Transform
December 31, 2019
Histogram of Oriented Gradients Feature Descriptor
January 2, 2020
Show all

Hough Transform Implementation With Python

The Hough transform (Duda and Hart, 1972), which started out as a technique to detect lines in an image, has been generalised and extended to detect curves in 2D and 3D.

Here, we understand how an image is transformed into the hough space for line detection and implement it in Python.

How it works – gradient-intercept parameter space

A line in Cartesian form has the equation:

y = mx + b

m = gradient or slope of the line (rise/run)   
b = y-intercept

Given a set of edge points or a binary image indicating edges, we want to find as many lines that connect these points in image space.

Say we have 2 edge points (x1,y1x1,y1) and (x2,y2x2,y2). For each edge point at various gradient values (mm=-0.5, 1.0, 1.5, etc), we calculate the corresponding bb values. The image below shows the various lines through an edge point in image space and the plot of these lines in parameter space. Points which are collinear in the cartesian image space will intersect at a point in mm-bb parameter space.



All points on a line in image space intersect at a common point in parameter space. This common point (mm, bb) represents the line in image space.

Unfortunately, the slope, mm, is undefined when the line is vertical (division by 0!).
To overcome this, we use another parameter space, the hough space.

How it works – angle-distance parameter space

A line in Polar coordinate system has the equation:

ρ = x cos θ + y sin θ

ρ (rho) = distance from origin to the line. [-max_dist to max_dist].
          max_dist is the diagonal length of the image.  
θ = angle from origin to the line. [-90° to 90°]

If you’re interested in how ρρ is derived, I’ve laid out the maths at the bottom extras section: deriving rho.

To explain the hough transform, I’ll use a simple example. I’ve created an input binary image of size 30 x 30 pixels with points at (5, 25) and (20, 10) shown below. The image is transformed to the hough space by calculating ρρ with a point at each angle from −90∘−90∘ to 90∘90∘ (negative angles are anti-clockwise starting horizontally from the origin and positive angles are clockwise). The points in hough space make a sinusoidal curve.

The value of ρρ at various angles for the 2 edge points:

(5, 25)-25-1452125
(20, 10)-107202110

We see that the curves in hough space intersect at 45∘45∘ with ρ=21ρ=21.
Curves generated by collinear points in the image space intersect in peaks (ρ,θ)(ρ,θ) in the Hough transform space. The more curves intersect at a point, the more “votes” a line in image space will receive. We’ll see this in the next implementation section.

Algorithm steps

  1. Corner or edge detection. (E.g. using cannysobeladaptive thresholding). The resultant binary/grey image will have 0s indicating non-edges and 1s or above indicating edges. This is our input image.
  2. Rho range and Theta range creation. ρρ ranges from -max_dist to max_dist where max_dist is the diagonal length of the input image. θθ ranges from −90∘−90∘ to 90∘90∘. You can have more or less bins in the ranges to tradeoff accuracy, space and speed. E.g. Every third angle in −90∘−90∘ to 90∘90∘ to reduce from 180 to 60 values.
  3. Hough accumulator of θθ vs ρρ. It is a 2D array with the number of rows equal to the number of ρρ values and the number of columns equal to the number of θθ values.
  4. Voting in the accumulator. For each edge point and for each θθ value, find the nearest ρρ value and increment that index in the accumulator. Each element tells how many points/pixels contributed “votes” for potential line candidates with parameters (ρ,θ)(ρ,θ).
  5. Peak finding. Local maxima in the accumulator indicates the parameters of the most prominent lines in the input image. Peaks can be found most easily by applying a threshold or a relative threshold (values equal to or greater than some fixed percentage of the global maximum value).

Code Time

Full code on  github

import numpy as np

def hough_line(img):
  # Rho and Theta ranges
  thetas = np.deg2rad(np.arange(-90.0, 90.0))
  width, height = img.shape
  diag_len = np.ceil(np.sqrt(width * width + height * height))   # max_dist
  rhos = np.linspace(-diag_len, diag_len, diag_len * 2.0)

  # Cache some resuable values
  cos_t = np.cos(thetas)
  sin_t = np.sin(thetas)
  num_thetas = len(thetas)

  # Hough accumulator array of theta vs rho
  accumulator = np.zeros((2 * diag_len, num_thetas), dtype=np.uint64)
  y_idxs, x_idxs = np.nonzero(img)  # (row, col) indexes to edges

  # Vote in the hough accumulator
  for i in range(len(x_idxs)):
    x = x_idxs[i]
    y = y_idxs[i]

    for t_idx in range(num_thetas):
      # Calculate rho. diag_len is added for a positive index
      rho = round(x * cos_t[t_idx] + y * sin_t[t_idx]) + diag_len
      accumulator[rho, t_idx] += 1

  return accumulator, thetas, rhos


# Create binary image and call hough_line
image = np.zeros((50,50))
image[10:40, 10:40] = np.eye(30)
accumulator, thetas, rhos = hough_line(image)

# Easiest peak finding based on max votes
idx = np.argmax(accumulator)
rho = rhos[idx / accumulator.shape[1]]
theta = thetas[idx % accumulator.shape[1]]
print "rho={0:.2f}, theta={1:.0f}".format(rho, np.rad2deg(theta))

rho=0.50, theta=-45

Hough transform (and the faster probabilistic version) is available in openCV and scikit-image.


Hough transform can be extended to detect circles of the equation
r2=(x−a)2+(y−b)2r2=(x−a)2+(y−b)2 in the parameter space, ρ=(a,b,r)ρ=(a,b,r).
Furthermore, it can be generalized to detect arbitrary shapes (D. H. Ballard, 1981).

Another approach is the Progressive Probabilistic Hough Transform (Galamhos et al, 1999). The algorithm uses random subsets of voting points in the accumulator and checks for the longest segment of pixels with minimum gaps. Line segments that exceed a minimum length threshold are added to the list. This returns the beginning and end point of each line segment in the image. It has 3 thresholds: a minimum number of votes in the Hough accumulator, a maximum line gap for merging and a minimum line length.


Deriving rho: ρ = x cos θ + y sin θ

With basic trigonometry, we know that for right-angled triangles,
sinθ=oppositehypotenusesinθ=oppositehypotenuse and cosθ=adjacenthypotenusecosθ=adjacenthypotenuse.

We want to convert the cartesian form y=mx+by=mx+b with parameters (m,b)(m,b) to polar form with parameters (ρ,θ)(ρ,θ).

The line from the origin with distance ρρ has a gradient of sinθcosθsinθcosθ. The line of interest, which is perpendicular to it, will have a negative reciprocal gradient value of −cosθsinθ−cosθsinθ. For bb, the y-intercept of the line of interest, sinθ=ρbsinθ=ρb.

With m=−cosθsinθm=−cosθsinθ and b=ρsinθb=ρsinθ, we get ρ=x cosθ+y sinθρ=x cosθ+y sinθ.

Amir Masoud Sefidian
Amir Masoud Sefidian
Data Scientist, Machine Learning Engineer, Researcher, Software Developer

Comments are closed.