|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |
java.lang.Objectit.unimi.dsi.webgraph.Transform
public class Transform
Static methods that manipulate immutable graphs.
Most methods take an ImmutableGraph
(along with some other data, that
depend on the kind of transformation), and return another ImmutableGraph
that represents the transformed
version.
Nested Class Summary | |
---|---|
static interface |
Transform.ArcFilter
Provides a method to accept or reject an arc. |
static class |
Transform.BatchGraph
|
static interface |
Transform.LabelledArcFilter
Provides a method to accept or reject a labelled arc. |
static class |
Transform.LowerBound
An arc filter that rejects arcs whose well-known attribute has a value smaller than a given threshold. |
static class |
Transform.NodeClassFilter
An arc filter that only accepts arcs whose endpoints belong to the same (if the parameter keepOnlySame is true) or to
different (if keepOnlySame is false) classes. |
Field Summary | |
---|---|
static it.unimi.dsi.webgraph.Transform.NoLoops |
NO_LOOPS
A singleton providing an arc filter that rejects loops. |
Method Summary | |
---|---|
static ArcLabelledImmutableGraph |
compose(ArcLabelledImmutableGraph g0,
ArcLabelledImmutableGraph g1,
LabelSemiring strategy)
Returns the composition (a.k.a. |
static ImmutableGraph |
compose(ImmutableGraph g0,
ImmutableGraph g1)
Returns the composition (a.k.a. |
protected static boolean |
ensureNumArgs(String[] param,
int n)
Ensures that the arguments are exactly n , if n is nonnegative, or
at least -n , otherwise. |
static ArcLabelledImmutableGraph |
filterArcs(ArcLabelledImmutableGraph graph,
Transform.LabelledArcFilter filter)
Returns a labelled graph with some arcs eventually stripped, according to the given filter. |
static ArcLabelledImmutableGraph |
filterArcs(ArcLabelledImmutableGraph graph,
Transform.LabelledArcFilter filter,
ProgressLogger ignored)
Returns a labelled graph with some arcs eventually stripped, according to the given filter. |
static ImmutableGraph |
filterArcs(ImmutableGraph graph,
Transform.ArcFilter filter)
Returns a graph with some arcs eventually stripped, according to the given filter. |
static ImmutableGraph |
filterArcs(ImmutableGraph graph,
Transform.ArcFilter filter,
ProgressLogger ignored)
Returns a graph with some arcs eventually stripped, according to the given filter. |
static int[] |
grayCodePermutation(ImmutableGraph g)
Returns a permutation that would make the given graph adjacency lists in Gray-code order. |
static int[] |
hostByHostGrayCodePermutation(ImmutableGraph g,
int[] hostMap,
boolean strict)
Returns a permutation that would make the given graph adjacency lists in host-by-host Gray-code order. |
static int[] |
lexicographicalPermutation(ImmutableGraph g)
Returns a permutation that would make the given graph adjacency lists in lexicographical order. |
static ImmutableSequentialGraph |
line(ImmutableGraph g,
String mapBasename,
File tempDir,
int batchSize,
ProgressLogger pl)
Computes the line graph of a given symmetric graph. |
protected static ImmutableGraph |
load(Class<?> graphClass,
String baseName,
boolean sequential,
boolean offline,
ProgressLogger pl)
Loads a graph with given data and returns it. |
protected static void |
logBatches(ObjectArrayList<File> batches,
long pairs,
ProgressLogger pl)
|
static void |
main(String[] args)
|
static ImmutableGraph |
map(ImmutableGraph g,
int[] f)
Remaps the the graph nodes through a function specified via an array. |
static ImmutableGraph |
map(ImmutableGraph g,
int[] map,
ProgressLogger pl)
Remaps the the graph nodes through a partial function specified via an array. |
static ImmutableSequentialGraph |
mapOffline(ImmutableGraph g,
int[] map,
int batchSize)
Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via an array. |
static ImmutableSequentialGraph |
mapOffline(ImmutableGraph g,
int[] map,
int batchSize,
File tempDir)
Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via an array. |
static ImmutableSequentialGraph |
mapOffline(ImmutableGraph g,
int[] map,
int batchSize,
File tempDir,
ProgressLogger pl)
Returns an immutable graph obtained by remapping offline the graph nodes through a partial function specified via an array. |
static int |
processBatch(int n,
int[] source,
int[] target,
File tempDir,
List<File> batches)
Sorts the given source and target arrays w.r.t. |
static int[] |
randomPermutation(ImmutableGraph g,
long seed)
Returns a random permutation for a given graph. |
static ImmutableGraph |
symmetrize(ImmutableGraph g)
Returns a symmetrized graph. |
static ImmutableGraph |
symmetrize(ImmutableGraph g,
ImmutableGraph t)
Returns a symmetrized graph. |
static ImmutableGraph |
symmetrize(ImmutableGraph g,
ImmutableGraph t,
ProgressLogger pl)
Returns a symmetrized graph. |
static ImmutableGraph |
symmetrize(ImmutableGraph g,
ProgressLogger pl)
Returns a symmetrized graph. |
static ImmutableGraph |
symmetrizeOffline(ImmutableGraph g,
int batchSize)
Returns a symmetrized graph using an offline transposition. |
static ImmutableGraph |
symmetrizeOffline(ImmutableGraph g,
int batchSize,
File tempDir)
Returns a symmetrized graph using an offline transposition. |
static ImmutableGraph |
symmetrizeOffline(ImmutableGraph g,
int batchSize,
File tempDir,
ProgressLogger pl)
Returns a symmetrized graph using an offline transposition. |
static ImmutableGraph |
transpose(ImmutableGraph g)
Returns an immutable graph obtained by reversing all arcs in g . |
static ImmutableGraph |
transpose(ImmutableGraph g,
ProgressLogger pl)
Returns an immutable graph obtained by reversing all arcs in g . |
static ArcLabelledImmutableGraph |
transposeOffline(ArcLabelledImmutableGraph g,
int batchSize)
Returns an arc-labelled immutable graph obtained by reversing all arcs in g , using an offline method. |
static ArcLabelledImmutableGraph |
transposeOffline(ArcLabelledImmutableGraph g,
int batchSize,
File tempDir)
Returns an arc-labelled immutable graph obtained by reversing all arcs in g , using an offline method. |
static ArcLabelledImmutableGraph |
transposeOffline(ArcLabelledImmutableGraph g,
int batchSize,
File tempDir,
ProgressLogger pl)
Returns an arc-labelled immutable graph obtained by reversing all arcs in g , using an offline method. |
static ImmutableSequentialGraph |
transposeOffline(ImmutableGraph g,
int batchSize)
Returns an immutable graph obtained by reversing all arcs in g , using an offline method. |
static ImmutableSequentialGraph |
transposeOffline(ImmutableGraph g,
int batchSize,
File tempDir)
Returns an immutable graph obtained by reversing all arcs in g , using an offline method. |
static ImmutableSequentialGraph |
transposeOffline(ImmutableGraph g,
int batchSize,
File tempDir,
ProgressLogger pl)
Returns an immutable graph obtained by reversing all arcs in g , using an offline method. |
static ArcLabelledImmutableGraph |
union(ArcLabelledImmutableGraph g0,
ArcLabelledImmutableGraph g1,
LabelMergeStrategy labelMergeStrategy)
Returns the union of two arc-labelled immutable graphs. |
static ImmutableGraph |
union(ImmutableGraph g0,
ImmutableGraph g1)
Returns the union of two immutable graphs. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final it.unimi.dsi.webgraph.Transform.NoLoops NO_LOOPS
Method Detail |
---|
public static ImmutableGraph filterArcs(ImmutableGraph graph, Transform.ArcFilter filter)
graph
- a graph.filter
- the filter (telling whether each arc should be kept or not).
public static ImmutableGraph filterArcs(ImmutableGraph graph, Transform.ArcFilter filter, ProgressLogger ignored)
graph
- a graph.filter
- the filter (telling whether each arc should be kept or not).ignored
- a progress logger.
public static ArcLabelledImmutableGraph filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter)
graph
- a labelled graph.filter
- the filter (telling whether each arc should be kept or not).
public static ArcLabelledImmutableGraph filterArcs(ArcLabelledImmutableGraph graph, Transform.LabelledArcFilter filter, ProgressLogger ignored)
graph
- a labelled graph.filter
- the filter (telling whether each arc should be kept or not).ignored
- a progress logger.
public static ImmutableGraph map(ImmutableGraph g, int[] map, ProgressLogger pl)
map.length=g.numNodes()
,
and map[i]
is the new name of node i
, or -1 if the node
should not be mapped. If some
index appearing in map
is larger than or equal to the
number of nodes of g
, the resulting graph is enlarged correspondingly.
Arcs are mapped in the obvious way; in other words, there is
an arc from map[i]
to map[j]
(both nonnegative)
in the transformed
graph iff there was an arc from i
to j
in the original graph.
Note that if map
is bijective, the returned graph
is simply a permutation of the original graph.
Otherwise, the returned graph is obtained by deleting nodes mapped
to -1, quotienting nodes w.r.t. the equivalence relation induced by the fibres of map
and renumbering the result, always according to map
.
This method requires ImmutableGraph.randomAccess() random access.
g
- the graph to be transformed.map
- the transformation map.pl
- a progress logger to be used during the precomputation, or null
.
public static ImmutableGraph map(ImmutableGraph g, int[] f)
g
- the graph to be transformed.f
- the transformation map.
map(ImmutableGraph, int[], ProgressLogger)
public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize) throws IOException
g
- the source graph.batchSize
- the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
IOException
symmetrizeOffline(ImmutableGraph, int, File, ProgressLogger)
public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir) throws IOException
g
- the source graph.batchSize
- the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
- a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.
IOException
symmetrizeOffline(ImmutableGraph, int, File, ProgressLogger)
public static ImmutableGraph symmetrizeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
The symmetrized graph is the union of a graph and of its transpose. This method will
compute the transpose on the fly using transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)
.
g
- the source graph.batchSize
- the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
- a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.pl
- a progress logger, or null
.
IOException
public static ImmutableGraph symmetrize(ImmutableGraph g, ImmutableGraph t, ProgressLogger pl)
The symmetrized graph is the union of a graph and of its transpose. This method will use the provided transposed graph, if any, instead of computing it on the fly.
g
- the source graph.t
- the graph g
transposed; if null
, the transposed graph will be computed on the fly.pl
- a progress logger, or null
.
public static ImmutableGraph symmetrize(ImmutableGraph g, ImmutableGraph t)
The symmetrized graph is the union of a graph and of its transpose. This method will use the provided transposed graph, if any, instead of computing it on the fly.
g
- the source graph.t
- the graph g
transposed; if null
, the transposed graph will be computed on the fly.
public static ImmutableGraph symmetrize(ImmutableGraph g, ProgressLogger pl)
g
- the source graph.pl
- a progress logger.
symmetrize(ImmutableGraph, ImmutableGraph, ProgressLogger)
public static ImmutableGraph symmetrize(ImmutableGraph g)
g
- the source graph.
symmetrize(ImmutableGraph, ImmutableGraph, ProgressLogger)
public static ImmutableGraph transpose(ImmutableGraph g, ProgressLogger pl)
g
.
This method can process offline graphs).
g
- an immutable graph.pl
- a progress logger, or null
.
g
.public static int processBatch(int n, int[] source, int[] target, File tempDir, List<File> batches) throws IOException
n
- the index of the last element to be sorted (exclusive).source
- the source array.target
- the target array.tempDir
- a temporary directory where to store the sorted arrays, or null
batches
- a list of files to which the batch file will be added.
n
because duplicates are eliminated).
IOException
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize) throws IOException
g
, using an offline method.
g
- an immutable graph.batchSize
- the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
g
.
IOException
transposeOffline(ImmutableGraph, int, File, ProgressLogger)
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize, File tempDir) throws IOException
g
, using an offline method.
g
- an immutable graph.batchSize
- the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
- a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.
g
.
IOException
transposeOffline(ImmutableGraph, int, File, ProgressLogger)
public static ImmutableSequentialGraph transposeOffline(ImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
g
, using an offline method.
This method should be used to transpose very large graph in case transpose(ImmutableGraph)
requires too much memory. It creates a number of sorted batches on disk containing arcs
represented by a pair of gap-compressed integers ordered by target
and returns an ImmutableGraph
that can be accessed only using a node iterator
. The node iterator
merges on the fly the batches, providing a transposed graph. The files are marked with
File.deleteOnExit()
, so they should disappear when the JVM exits. An additional safety-net
finaliser tries to delete the batches, too.
Note that each NodeIterator
returned by the transpose requires opening all batches at the same time.
The batches are closed when they are exhausted, so a complete scan of the graph closes them all. In any case,
another safety-net finaliser closes all files when the iterator is collected.
This method can process offline graphs.
g
- an immutable graph.batchSize
- the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
- a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.pl
- a progress logger, or null
.
g
.
IOException
protected static void logBatches(ObjectArrayList<File> batches, long pairs, ProgressLogger pl)
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, int[] map, int batchSize) throws IOException
g
- an immutable graph.map
- the transformation map.batchSize
- the number of integers in a batch; two arrays of integers of this size will be allocated by this method.
g
.
IOException
mapOffline(ImmutableGraph, int[], int, File, ProgressLogger)
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, int[] map, int batchSize, File tempDir) throws IOException
g
- an immutable graph.map
- the transformation map.batchSize
- the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
- a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.
g
.
IOException
mapOffline(ImmutableGraph, int[], int, File, ProgressLogger)
public static ImmutableSequentialGraph mapOffline(ImmutableGraph g, int[] map, int batchSize, File tempDir, ProgressLogger pl) throws IOException
map(ImmutableGraph, int[], ProgressLogger)
for the semantics of this method and transpose(ImmutableGraph, ProgressLogger)
for
implementation and performance-related details.
g
- an immutable graph.map
- the transformation map.batchSize
- the number of integers in a batch; two arrays of integers of this size will be allocated by this method.tempDir
- a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.pl
- a progress logger, or null
.
g
.
IOException
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize) throws IOException
g
, using an offline method.
g
- an immutable graph.batchSize
- the number of integers in a batch; two arrays of integers of this size will be allocated by this method,
plus an additional FastByteArrayOutputStream
needed to store all the labels for a batch.
g
.
IOException
transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir) throws IOException
g
, using an offline method.
g
- an immutable graph.batchSize
- the number of integers in a batch; two arrays of integers of this size will be allocated by this method,
plus an additional FastByteArrayOutputStream
needed to store all the labels for a batch.tempDir
- a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.
g
.
IOException
transposeOffline(ArcLabelledImmutableGraph, int, File, ProgressLogger)
public static ArcLabelledImmutableGraph transposeOffline(ArcLabelledImmutableGraph g, int batchSize, File tempDir, ProgressLogger pl) throws IOException
g
, using an offline method.
This method should be used to transpose very large graph in case transpose(ImmutableGraph)
requires too much memory. It creates a number of sorted batches on disk containing arcs
represented by a pair of integers in DataInput
format ordered by target
and returns an ImmutableGraph
that can be accessed only using a node iterator
. The node iterator
merges on the fly the batches, providing a transposed graph. The files are marked with
File.deleteOnExit()
, so they should disappear when the JVM exits. An additional safety-net
finaliser tries to delete the batches, too. As far as labels are concerned, they are temporarily stored in
an in-memory bit stream, that is permuted when it is stored on the disk
Note that each NodeIterator
returned by the transpose requires opening all batches at the same time.
The batches are closed when they are exhausted, so a complete scan of the graph closes them all. In any case,
another safety-net finaliser closes all files when the iterator is collected.
This method can process offline graphs. Note that no method to transpose on-line arc-labelled graph is provided currently.
g
- an immutable graph.batchSize
- the number of integers in a batch; two arrays of integers of this size will be allocated by this method,
plus an additional FastByteArrayOutputStream
needed to store all the labels for a batch.tempDir
- a temporary directory for the batches, or null
for File.createTempFile(java.lang.String, java.lang.String)
's choice.pl
- a progress logger.
g
.
IOException
public static ImmutableGraph transpose(ImmutableGraph g)
g
.
This method can process offline graphs.
g
- an immutable graph.
g
.transpose(ImmutableGraph, ProgressLogger)
public static ArcLabelledImmutableGraph union(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelMergeStrategy labelMergeStrategy)
The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
g0
- the first graph.g1
- the second graph.labelMergeStrategy
- the strategy used to merge labels when the same arc
is present in both graphs; if null
, Labels.KEEP_FIRST_MERGE_STRATEGY
is used.
public static ImmutableGraph union(ImmutableGraph g0, ImmutableGraph g1)
The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
g0
- the first graph.g1
- the second graph.
public static ImmutableGraph compose(ImmutableGraph g0, ImmutableGraph g1)
The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
g0
- the first graph.g1
- the second graph.
public static ArcLabelledImmutableGraph compose(ArcLabelledImmutableGraph g0, ArcLabelledImmutableGraph g1, LabelSemiring strategy)
The two arguments may differ in the number of nodes, in which case the resulting graph will be large as the larger graph.
g0
- the first graph.g1
- the second graph.strategy
- a label semiring.
public static ImmutableSequentialGraph line(ImmutableGraph g, String mapBasename, File tempDir, int batchSize, ProgressLogger pl) throws IOException
Two additional files are created, with names stemmed from mapBasename
; the i-th entries of the two files
identify the source and target node (in the original graph) corresponding the node i in the line graph.
g
- the graph (it must be symmetric and loopless).mapBasename
- the basename of two files that will, at the end, contain as many integers as the number of nodes in the line graph: the i-th
integer in the file mapBasename.source
will contain the source of the arc corresponding to the i-th
node in the line graph, and similarly mapBasename.target
will give the target.tempDir
- the temporary directory to be used.batchSize
- the size used for batches.pl
- the progress logger to be used.
g
.
IOException
public static int[] grayCodePermutation(ImmutableGraph g)
Gray codes list all sequences of n zeros and ones in such a way that adjacent lists differ by exactly one bit. If we assign to each row of the adjacency matrix of a graph its index as a Gray code, we obtain a permutation that will make similar lines nearer.
Note that since a graph permutation permutes both rows and columns, this transformation is not idempotent: the Gray-code permutation produced from a matrix that has been Gray-code sorted will not be, in general, the identity.
The important feature of Gray-code ordering is that it is completely endogenous (e.g., determined by the graph itself), contrarily to, say, lexicographic URL ordering (which relies on the knowledge of the URL associated to each node).
g
- an immutable graph.
map(ImmutableGraph, int[], ProgressLogger)
).public static int[] randomPermutation(ImmutableGraph g, long seed)
g
- an immutable graph.seed
- for XorShiftStarRandom
.
public static int[] hostByHostGrayCodePermutation(ImmutableGraph g, int[] hostMap, boolean strict)
This permutation differs from grayCodePermutation(ImmutableGraph)
in that Gray codes
are computed host by host. There are two variants, strict and loose. In the first case,
we restrict the adjacency matrix to the submatrix corresponding to a host and compute the ordering. In
the second case, we just restrict to the rows corresponding to a host, but then entire rows are used
to compute the ordering.
g
- an immutable graph.hostMap
- an array mapping each URL to its host (it is sufficient that each host is assigned a distinct number).strict
- if true, host-by-host Gray code computation will be strict, that is, the order is computed only
between columns of the same host of the rows.
map(ImmutableGraph, int[], ProgressLogger)
).public static int[] lexicographicalPermutation(ImmutableGraph g)
Note that since a graph permutation permutes both rows and columns, this transformation is not idempotent: the lexicographical permutation produced from a matrix that has been lexicographically sorted will not be, in general, the identity.
The important feature of lexicographical ordering is that it is completely endogenous (e.g., determined by the graph itself), contrarily to, say, lexicographic URL ordering (which relies on the knowledge of the URL associated to each node).
Warning: rows are numbered from zero from the left. This means, for instance, that nodes with an arc towards node zero are lexicographically smaller than nodes without it.
g
- an immutable graph.
map(ImmutableGraph, int[], ProgressLogger)
).protected static boolean ensureNumArgs(String[] param, int n)
n
, if n
is nonnegative, or
at least -n
, otherwise.
protected static ImmutableGraph load(Class<?> graphClass, String baseName, boolean sequential, boolean offline, ProgressLogger pl) throws IllegalArgumentException, SecurityException, IllegalAccessException, InvocationTargetException, NoSuchMethodException, IOException
graphClass
- the class of the graph to be loaded.baseName
- the graph basename.sequential
- whether the graph is to be loaded in a sequential fashion.offline
- whether the graph is to be loaded in an offline fashion.pl
- a progress logger.
IllegalArgumentException
SecurityException
IllegalAccessException
InvocationTargetException
NoSuchMethodException
IOException
public static void main(String[] args) throws IOException, IllegalArgumentException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException, ClassNotFoundException, JSAPException
IOException
IllegalArgumentException
SecurityException
InstantiationException
IllegalAccessException
InvocationTargetException
NoSuchMethodException
ClassNotFoundException
JSAPException
|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |