Coverage for local/lib/python2.7/site-packages/sage/graphs/distances_all

Hot-keys on this page

r m x p toggle line displays

j k next/prev highlighted chunk

0 (zero) top of page

1 (one) first highlighted chunk

# cython: binding=True

r"""

Distances/shortest paths between all pairs of vertices

This module implements a few functions that deal with the computation of

distances or shortest paths between all pairs of vertices.

**Efficiency** : Because these functions involve listing many times the

(out)-neighborhoods of (di)-graphs, it is useful in terms of efficiency to build

a temporary copy of the graph in a data structure that makes it easy to compute

quickly. These functions also work on large volume of data, typically dense

matrices of size `n^2`, and are expected to return corresponding dictionaries of

size `n^2`, where the integers corresponding to the vertices have first been

converted to the vertices' labels. Sadly, this last translating operation turns

out to be the most time-consuming, and for this reason it is also nice to have a

Cython module, and version of these functions that return C arrays, in order to

avoid these operations when they are not necessary.

**Memory cost** : The methods implemented in the current module sometimes need large

amounts of memory to return their result. Storing the distances between all

pairs of vertices in a graph on `1500` vertices as a dictionary of dictionaries

takes around 200MB, while storing the same information as a C array requires

4MB.

The module's main function

--------------------------

The C function ``all_pairs_shortest_path_BFS`` actually does all the

computations, and all the others (except for ``Floyd_Warshall``) are just

wrapping it. This function begins with copying the graph in a data structure

that makes it fast to query the out-neighbors of a vertex, then starts one

Breadth First Search per vertex of the (di)graph.

**What can this function compute ?**

- The matrix of predecessors.

This matrix `P` has size `n^2`, and is such that vertex `P[u,v]` is a

predecessor of `v` on a shortest `uv`-path. Hence, this matrix efficiently

encodes the information of a shortest `uv`-path for any `u,v\in G` :

indeed, to go from `u` to `v` you should first find a shortest

`uP[u,v]`-path, then jump from `P[u,v]` to `v` as it is one of its

outneighbors. Apply recursively and find out what the whole path is !.

- The matrix of distances.

This matrix has size `n^2` and associates to any `uv` the distance from

`u` to `v`.

- The vector of eccentricities.

This vector of size `n` encodes for each vertex `v` the distance to vertex

which is furthest from `v` in the graph. In particular, the diameter of

the graph is the maximum of these values.

**What does it take as input ?**

- ``gg`` a (Di)Graph.

- ``unsigned short * predecessors`` -- a pointer toward an array of size

`n^2\cdot\text{sizeof(unsigned short)}`. Set to ``NULL`` if you do not

want to compute the predecessors.

- ``unsigned short * distances`` -- a pointer toward an array of size

`n^2\cdot\text{sizeof(unsigned short)}`. The computation of the distances

is necessary for the algorithm, so this value can **not** be set to

``NULL``.

- ``int * eccentricity`` -- a pointer toward an array of size

`n\cdot\text{sizeof(int)}`. Set to ``NULL`` if you do not want to compute

the eccentricity.

**Technical details**

- The vertices are encoded as `1, ..., n` as they appear in the ordering of

``G.vertices()``.

- Because this function works on matrices whose size is quadratic compared

to the number of vertices when computing all distances or predecessors, it

uses short variables to store the vertices' names instead of long ones to

divide by 2 the size in memory. This means that only the

diameter/eccentricities can be computed on a graph of more than 65536

nodes. For information, the current version of the algorithm on a graph

with `65536=2^{16}` nodes creates in memory `2` tables on `2^{32}` short

elements (2bytes each), for a total of `2^{33}` bytes or `8` gigabytes. In

order to support larger sizes, we would have to replace shorts by 32-bits

int or 64-bits int, which would then require respectively 16GB or 32GB.

- In the C version of these functions, infinite distances are represented

with ``<unsigned short> -1 = 65535`` for ``unsigned short`` variables, and

by ``INT32_MAX`` otherwise. These case happens when the input is a

disconnected graph, or a non-strongly-connected digraph.

- A memory error is raised when data structures allocation failed. This

could happen with large graphs on computers with low memory space.

.. WARNING::

The function ``all_pairs_shortest_path_BFS`` has **no reason** to be

called by the user, even though he would be writing his code in Cython

and look for efficiency. This module contains wrappers for this function

that feed it with the good parameters. As the function is inlined, using

those wrappers actually saves time as it should avoid testing the

parameters again and again in the main function's body.

AUTHOR:

- Nathann Cohen (2011)

- David Coudert (2014) -- 2sweep, multi-sweep and iFUB for diameter computation

REFERENCE:

.. [KRG96b] \S. Klavzar, A. Rajapakse, and I. Gutman. The Szeged and the

Wiener index of graphs. *Applied Mathematics Letters*, 9(5):45--49, 1996.

.. [GYLL93c] \I. Gutman, Y.-N. Yeh, S.-L. Lee, and Y.-L. Luo. Some recent

results in the theory of the Wiener number. *Indian Journal of

Chemistry*, 32A:651--661, 1993.

.. [CGH+13] \P. Crescenzi, R. Grossi, M. Habib, L. Lanzi, A. Marino. On computing

the diameter of real-world undirected graphs. *Theor. Comput. Sci.* 514: 84-95

(2013) :doi:`10.1016/j.tcs.2012.09.018`

.. [CGI+10] \P. Crescenzi, R. Grossi, C. Imbrenda, L. Lanzi, and A. Marino.

Finding the Diameter in Real-World Graphs: Experimentally Turning a Lower

Bound into an Upper Bound. Proceedings of *18th Annual European Symposium on

Algorithms*. Lecture Notes in Computer Science, vol. 6346, 302-313. Springer

(2010).

.. [MLH08] \C. Magnien, M. Latapy, and M. Habib. Fast computation of empirically

tight bounds for the diameter of massive graphs. *ACM Journal of Experimental

Algorithms* 13 (2008) http://dx.doi.org/10.1145/1412228.1455266

.. [TK13] \F. W. Takes and W. A. Kosters. Computing the eccentricity distribution

of large graphs. *Algorithms* 6:100-118 (2013)

http://dx.doi.org/10.3390/a6010100

Functions

---------

"""

#*****************************************************************************

# This program is free software: you can redistribute it and/or modify

# it under the terms of the GNU General Public License as published by

# the Free Software Foundation, either version 2 of the License, or

# (at your option) any later version.

# http://www.gnu.org/licenses/

#*****************************************************************************

from __future__ import print_function

include "sage/data_structures/binary_matrix.pxi"

from libc.string cimport memset

from libc.stdint cimport uint64_t, uint32_t, INT32_MAX, UINT32_MAX

from cysignals.memory cimport sig_malloc, sig_calloc, sig_free

from cysignals.signals cimport sig_on, sig_off

from sage.graphs.base.c_graph cimport CGraphBackend

from sage.graphs.base.c_graph cimport CGraph

from sage.ext.memory_allocator cimport MemoryAllocator

from sage.graphs.base.static_sparse_graph cimport short_digraph, init_short_digraph, free_short_digraph, out_degree

cdef inline all_pairs_shortest_path_BFS(gg,

unsigned short * predecessors,

unsigned short * distances,

uint32_t * eccentricity):

"""

See the module's documentation.

"""

from sage.rings.infinity import Infinity

cdef list int_to_vertex = gg.vertices()

cdef int i

cdef MemoryAllocator mem = MemoryAllocator()

cdef int n = len(int_to_vertex)

# Computing the predecessors/distances can only be done if we have less than

# MAX_UNSIGNED_SHORT vertices. No problem with the eccentricities though as

# we store them on an integer vector.

if (predecessors != NULL or distances != NULL) and n > <unsigned short> -1:

raise ValueError("The graph backend contains more than "+

str(<unsigned short> -1)+" nodes and we cannot "+

"compute the matrix of distances/predecessors on "+

"something like that !")

# The vertices which have already been visited

cdef bitset_t seen

bitset_init(seen, n)

# The list of waiting vertices, the beginning and the end of the list

cdef int * waiting_list = <int *> mem.allocarray(n, sizeof(int))

cdef int waiting_beginning = 0

cdef int waiting_end = 0

cdef int source

cdef int v, u

cdef uint32_t * p_tmp

cdef uint32_t * end

cdef unsigned short * c_predecessors = predecessors

cdef int * c_distances = <int *> mem.allocarray(n, sizeof(int))

# Copying the whole graph to obtain the list of neighbors quicker than by

# calling out_neighbors

# The edges are stored in the vector p_edges. This vector contains, from

# left to right The list of the first vertex's outneighbors, then the

# second's, then the third's, ...

# The outneighbors of vertex i are enumerated from

# p_vertices[i] to p_vertices[i+1] - 1

# (if p_vertices[i] is equal to p_vertices[i+1], then i has no outneighbours)

# This data structure is well documented in the module

# sage.graphs.base.static_sparse_graph

cdef short_digraph sd

init_short_digraph(sd, gg)

cdef uint32_t ** p_vertices = sd.neighbors

cdef uint32_t * p_edges = sd.edges

cdef uint32_t * p_next = p_edges

# We run n different BFS taking each vertex as a source

for source in range(n):

# The source is seen

bitset_set_first_n(seen, 0)

bitset_add(seen, source)

# Its parameters can already be set

c_distances[source] = 0

if predecessors != NULL:

c_predecessors[source] = source

# and added to the queue

waiting_list[0] = source

waiting_beginning = 0

waiting_end = 0

# For as long as there are vertices left to explore

while waiting_beginning <= waiting_end:

# We pick the first one

v = waiting_list[waiting_beginning]

p_tmp = p_vertices[v]

end = p_vertices[v+1]

# Iterating over all the outneighbors u of v

while p_tmp < end:

u = p_tmp[0]

# If we notice one of these neighbors is not seen yet, we set

# its parameters and add it to the queue to be explored later.

if not bitset_in(seen, u):

c_distances[u] = c_distances[v]+1

if predecessors != NULL:

c_predecessors[u] = v

bitset_add(seen, u)

waiting_end += 1

waiting_list[waiting_end] = u

p_tmp += 1

waiting_beginning += 1

# If not all the vertices have been met

if bitset_len(seen) < n:

bitset_complement(seen, seen)

v = bitset_next(seen, 0)

while v >= 0:

c_distances[v] = INT32_MAX

if predecessors != NULL:

c_predecessors[v] = -1

v = bitset_next(seen, v+1)

if eccentricity != NULL:

eccentricity[source] = UINT32_MAX

elif eccentricity != NULL:

eccentricity[source] = c_distances[waiting_list[n-1]]

if predecessors != NULL:

c_predecessors += n

if distances != NULL:

for i in range(n):

distances[i] = <unsigned short> c_distances[i]

distances += n

bitset_free(seen)

free_short_digraph(sd)

################

# Predecessors #

################

cdef unsigned short * c_shortest_path_all_pairs(G) except NULL:

r"""

Returns the matrix of predecessors in G.

The matrix `P` returned has size `n^2`, and is such that vertex `P[u,v]` is

a predecessor of `v` on a shortest `uv`-path. Hence, this matrix efficiently

encodes the information of a shortest `uv`-path for any `u,v\in G` : indeed,

to go from `u` to `v` you should first find a shortest `uP[u,v]`-path, then

jump from `P[u,v]` to `v` as it is one of its outneighbors.

"""

cdef unsigned int n = G.order()

cdef unsigned short * distances = <unsigned short *> sig_malloc(n*n*sizeof(unsigned short))

if distances == NULL:

raise MemoryError()

cdef unsigned short * predecessors = <unsigned short *> sig_malloc(n*n*sizeof(unsigned short))

if predecessors == NULL:

sig_free(distances)

raise MemoryError()

all_pairs_shortest_path_BFS(G, predecessors, distances, NULL)

sig_free(distances)

return predecessors

def shortest_path_all_pairs(G):

r"""

Returns the matrix of predecessors in G.

The matrix `P` returned has size `n^2`, and is such that vertex `P[u,v]` is

a predecessor of `v` on a shortest `uv`-path. Hence, this matrix efficiently

encodes the information of a shortest `uv`-path for any `u,v\in G` : indeed,

to go from `u` to `v` you should first find a shortest `uP[u,v]`-path, then

jump from `P[u,v]` to `v` as it is one of its outneighbors.

The integer corresponding to a vertex is its index in the list

``G.vertices()``.

EXAMPLES::

sage: from sage.graphs.distances_all_pairs import shortest_path_all_pairs

sage: g = graphs.PetersenGraph()

sage: shortest_path_all_pairs(g)

{0: {0: None, 1: 0, 2: 1, 3: 4, 4: 0, 5: 0, 6: 1, 7: 5, 8: 5, 9: 4},

1: {0: 1, 1: None, 2: 1, 3: 2, 4: 0, 5: 0, 6: 1, 7: 2, 8: 6, 9: 6},

2: {0: 1, 1: 2, 2: None, 3: 2, 4: 3, 5: 7, 6: 1, 7: 2, 8: 3, 9: 7},

3: {0: 4, 1: 2, 2: 3, 3: None, 4: 3, 5: 8, 6: 8, 7: 2, 8: 3, 9: 4},

4: {0: 4, 1: 0, 2: 3, 3: 4, 4: None, 5: 0, 6: 9, 7: 9, 8: 3, 9: 4},

5: {0: 5, 1: 0, 2: 7, 3: 8, 4: 0, 5: None, 6: 8, 7: 5, 8: 5, 9: 7},

6: {0: 1, 1: 6, 2: 1, 3: 8, 4: 9, 5: 8, 6: None, 7: 9, 8: 6, 9: 6},

7: {0: 5, 1: 2, 2: 7, 3: 2, 4: 9, 5: 7, 6: 9, 7: None, 8: 5, 9: 7},

8: {0: 5, 1: 6, 2: 3, 3: 8, 4: 3, 5: 8, 6: 8, 7: 5, 8: None, 9: 6},

9: {0: 4, 1: 6, 2: 7, 3: 4, 4: 9, 5: 7, 6: 9, 7: 9, 8: 6, 9: None}}

"""

cdef int n = G.order()

if n == 0:

return {}

cdef unsigned short * predecessors = c_shortest_path_all_pairs(G)

cdef unsigned short * c_predecessors = predecessors

cdef dict d = {}

cdef dict d_tmp

cdef CGraphBackend cg = <CGraphBackend> G._backend

cdef list int_to_vertex = G.vertices()

cdef int i, j

for i, l in enumerate(int_to_vertex):

int_to_vertex[i] = cg.get_vertex(l)

for j in range(n):

d_tmp = {}

for i in range(n):

if c_predecessors[i] == <unsigned short> -1:

d_tmp[int_to_vertex[i]] = None

else:

d_tmp[int_to_vertex[i]] = int_to_vertex[c_predecessors[i]]

d_tmp[int_to_vertex[j]] = None

d[int_to_vertex[j]] = d_tmp

c_predecessors += n

sig_free(predecessors)

return d

#############

# Distances #

#############

cdef unsigned short * c_distances_all_pairs(G):

r"""

Returns the matrix of distances in G.

The matrix `M` returned is of length `n^2`, and the distance between

vertices `u` and `v` is `M[u,v]`. The integer corresponding to a vertex is

its index in the list ``G.vertices()``.

"""

cdef unsigned int n = G.order()

cdef unsigned short * distances = <unsigned short *> sig_malloc(n*n*sizeof(unsigned short))

if distances == NULL:

raise MemoryError()

all_pairs_shortest_path_BFS(G, NULL, distances, NULL)

return distances

def distances_all_pairs(G):

r"""

Returns the matrix of distances in G.

This function returns a double dictionary ``D`` of vertices, in which the

distance between vertices ``u`` and ``v`` is ``D[u][v]``.

EXAMPLES::

sage: from sage.graphs.distances_all_pairs import distances_all_pairs

sage: g = graphs.PetersenGraph()

sage: distances_all_pairs(g)

{0: {0: 0, 1: 1, 2: 2, 3: 2, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2, 9: 2},

1: {0: 1, 1: 0, 2: 1, 3: 2, 4: 2, 5: 2, 6: 1, 7: 2, 8: 2, 9: 2},

2: {0: 2, 1: 1, 2: 0, 3: 1, 4: 2, 5: 2, 6: 2, 7: 1, 8: 2, 9: 2},

3: {0: 2, 1: 2, 2: 1, 3: 0, 4: 1, 5: 2, 6: 2, 7: 2, 8: 1, 9: 2},

4: {0: 1, 1: 2, 2: 2, 3: 1, 4: 0, 5: 2, 6: 2, 7: 2, 8: 2, 9: 1},

5: {0: 1, 1: 2, 2: 2, 3: 2, 4: 2, 5: 0, 6: 2, 7: 1, 8: 1, 9: 2},

6: {0: 2, 1: 1, 2: 2, 3: 2, 4: 2, 5: 2, 6: 0, 7: 2, 8: 1, 9: 1},

7: {0: 2, 1: 2, 2: 1, 3: 2, 4: 2, 5: 1, 6: 2, 7: 0, 8: 2, 9: 1},

8: {0: 2, 1: 2, 2: 2, 3: 1, 4: 2, 5: 1, 6: 1, 7: 2, 8: 0, 9: 2},

9: {0: 2, 1: 2, 2: 2, 3: 2, 4: 1, 5: 2, 6: 1, 7: 1, 8: 2, 9: 0}}

"""

from sage.rings.infinity import Infinity

cdef int n = G.order()

if n == 0:

return {}

cdef unsigned short * distances = c_distances_all_pairs(G)

cdef unsigned short * c_distances = distances

cdef dict d = {}

cdef dict d_tmp

cdef list int_to_vertex = G.vertices()

cdef int i, j

for j in range(n):

d_tmp = {}

for i in range(n):

if c_distances[i] == <unsigned short> -1:

d_tmp[int_to_vertex[i]] = Infinity

else:

d_tmp[int_to_vertex[i]] = c_distances[i]

d[int_to_vertex[j]] = d_tmp

c_distances += n

sig_free(distances)

return d

def is_distance_regular(G, parameters = False):

r"""

Tests if the graph is distance-regular

A graph `G` is distance-regular if for any integers `j,k` the value

of `|\{x:d_G(x,u)=j,x\in V(G)\} \cap \{y:d_G(y,v)=j,y\in V(G)\}|` is constant

for any two vertices `u,v\in V(G)` at distance `i` from each other.

In particular `G` is regular, of degree `b_0` (see below), as one can take `u=v`.

Equivalently a graph is distance-regular if there exist integers `b_i,c_i`

such that for any two vertices `u,v` at distance `i` we have

* `b_i = |\{x:d_G(x,u)=i+1,x\in V(G)\}\cap N_G(v)\}|, \ 0\leq i\leq d-1`

* `c_i = |\{x:d_G(x,u)=i-1,x\in V(G)\}\cap N_G(v)\}|, \ 1\leq i\leq d,`

where `d` is the diameter of the graph. For more information on

distance-regular graphs, see its associated :wikipedia:`wikipedia

page <Distance-regular_graph>`.

INPUT:

- ``parameters`` (boolean) -- if set to ``True``, the function returns the

pair ``(b,c)`` of lists of integers instead of ``True`` (see the definition

above). Set to ``False`` by default.

.. SEEALSO::

* :meth:`~sage.graphs.generic_graph.GenericGraph.is_regular`

* :meth:`~Graph.is_strongly_regular`

EXAMPLES::

sage: g = graphs.PetersenGraph()

sage: g.is_distance_regular()

True

sage: g.is_distance_regular(parameters = True)

([3, 2, None], [None, 1, 1])

Cube graphs, which are not strongly regular, are a bit more interesting::

sage: graphs.CubeGraph(4).is_distance_regular()

True

sage: graphs.OddGraph(5).is_distance_regular()

True

Disconnected graph::

sage: (2*graphs.CubeGraph(4)).is_distance_regular()

True

TESTS::

sage: graphs.PathGraph(2).is_distance_regular(parameters = True)

([1, None], [None, 1])

sage: graphs.Tutte12Cage().is_distance_regular(parameters=True)

([3, 2, 2, 2, 2, 2, None], [None, 1, 1, 1, 1, 1, 3])

"""

cdef int i,l,u,v,d,b,c,k

cdef int n = G.order()

cdef int infinity = <unsigned short> -1

if n <= 1:

return ([],[]) if parameters else True

if not G.is_regular():

return False

k = G.degree(next(G.vertex_iterator()))

# Matrix of distances

cdef unsigned short * distance_matrix = c_distances_all_pairs(G)

# The diameter, i.e. the longest *finite* distance between two vertices

cdef int diameter = 0

for i in range(n*n):

if distance_matrix[i] > diameter and distance_matrix[i] != infinity:

diameter = distance_matrix[i]

cdef bitset_t b_tmp

bitset_init(b_tmp, n)

# b_distance_matrix[d*n+v] is the set of vertices at distance d from v.

cdef binary_matrix_t b_distance_matrix

try:

binary_matrix_init(b_distance_matrix,n*(diameter+2),n)

except MemoryError:

sig_free(distance_matrix)

bitset_free(b_tmp)

raise

# Fills b_distance_matrix

for u in range(n):

for v in range(u,n):

d = distance_matrix[u*n+v]

if d != infinity:

binary_matrix_set1(b_distance_matrix, d*n+u, v)

binary_matrix_set1(b_distance_matrix, d*n+v, u)

cdef list bi = [-1 for i in range(diameter +1)]

cdef list ci = [-1 for i in range(diameter +1)]

# Applying the definition with b_i,c_i

for u in range(n):

for v in range(n):

if u == v:

continue

d = distance_matrix[u*n+v]

if d == infinity:

continue

# Computations of b_d and c_d for u,v. We intersect sets stored in

# b_distance_matrix.

bitset_and(b_tmp, b_distance_matrix.rows[(d+1)*n+u], b_distance_matrix.rows[n+v])

b = bitset_len(b_tmp)

bitset_and(b_tmp, b_distance_matrix.rows[(d-1)*n+u], b_distance_matrix.rows[n+v])

c = bitset_len(b_tmp)

# Consistency of b_d and c_d

if bi[d] == -1:

bi[d] = b

ci[d] = c

elif bi[d] != b or ci[d] != c:

sig_free(distance_matrix)

binary_matrix_free(b_distance_matrix)

bitset_free(b_tmp)

return False

sig_free(distance_matrix)

binary_matrix_free(b_distance_matrix)

bitset_free(b_tmp)

if parameters:

bi[0] = k

bi[diameter] = None

ci[0] = None

return bi, ci

else:

return True

###################################

# Both distances and predecessors #

###################################

def distances_and_predecessors_all_pairs(G):

r"""

Returns the matrix of distances in G and the matrix of predecessors.

Distances : the matrix `M` returned is of length `n^2`, and the distance

between vertices `u` and `v` is `M[u,v]`. The integer corresponding to a

vertex is its index in the list ``G.vertices()``.

Predecessors : the matrix `P` returned has size `n^2`, and is such that

vertex `P[u,v]` is a predecessor of `v` on a shortest `uv`-path. Hence, this

matrix efficiently encodes the information of a shortest `uv`-path for any

`u,v\in G` : indeed, to go from `u` to `v` you should first find a shortest

`uP[u,v]`-path, then jump from `P[u,v]` to `v` as it is one of its

outneighbors.

The integer corresponding to a vertex is its index in the list

``G.vertices()``.

EXAMPLES::

sage: from sage.graphs.distances_all_pairs import distances_and_predecessors_all_pairs

sage: g = graphs.PetersenGraph()

sage: distances_and_predecessors_all_pairs(g)

({0: {0: 0, 1: 1, 2: 2, 3: 2, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2, 9: 2},