|
|||||||||
PREV NEXT | FRAMES NO FRAMES |
See:
Description
Packages | |
---|---|
it.unimi.dsi.webgraph | Main classes implementing the WebGraph algorithms. |
it.unimi.dsi.webgraph.algo | Classes implementing useful algorithms on graphs. |
it.unimi.dsi.webgraph.examples | Example classes that do nice things using the WebGraph framework. |
it.unimi.dsi.webgraph.labelling | Main classes implementing labelling for immutable graphs. |
WebGraph is a framework to study the web graph. It provides simple ways to manage very large graphs, exploiting modern compression techniques. More precisely, it is currently made of:
In the end, with WebGraph you can access and analyse very large web graphs. Using WebGraph is as easy as installing a few jar files and downloading a data set. This makes studying phenomena such as PageRank, distribution of graph properties of the web graph, etc., very easy.
You are welcome to use and improve WebGraph! If you find our software useful for your research, please quote our paper “The WebGraph Framework I: Compression Techniques”, by Paolo Boldi and Sebastiano Vigna, in Proc. of the Thirteenth World–Wide Web Conference, pages 595−601, 2004, ACM Press.
Warning: WebGraph 2+ is not fully compatible with previous versions and
requires some minor code refactoring. Please
refer to the documentation of ImmutableGraph
.
For in-depth information on the Webgraph framework, you should have a look at its home page, where you can find some papers about the compression techniques it uses. Datasets are available at the LAW web site.
The classes of interest for the casual Webgraph user are ImmutableGraph
, which specifies the access
methods for an immutable graph, BVGraph
,
which allow to retrieve or recompress a graph stored in the format
described in The WebGraph
Framework I: Compression Techniques, and Transform
, which
provides several ways to transform an ImmutableGraph
.
If you plan on building your graphs dynamically, the class
ArrayListMutableGraph
makes it possible
to create incrementally a graph and then extract an immutable view.
The package it.unimi.dsi.webgraph.examples
contains useful
examples that show how to access sequentially and randomly an immutable
graph.
If you want to import your own data into WebGraph, you must write
an implementation of ImmutableGraph
that
exposes your data. A simple example is given in IntegerListImmutableGraph
,
a stub class exposing a simple, noncompressed binary format as an ImmutableGraph
.
Once your data is exposed in that way, you can get a compressed version
using the store()
method of your class of interest. Often, there
is a main method (see, e.g., BVGraph
) that
will load your class and invoke store()
for you.
As an alternative, the class ASCIIGraph
can be used to read graphs specified in a very simple ASCII format. The class
implements ASCIIGraph.loadOnce(java.io.InputStream)
so
that the file can be just piped into a class offering a main method that supports
loadOnce()
(e.g., BVGraph
).
You can also generate a graph in ASCII format and read it using
ASCIIGraph.loadOffline(CharSequence)
—the
graph will not be loaded into main memory.
ASCIIGraph
requires listing the successors of each
node on a separate line. If your graph is specified arc by arc (one arc per line) you
can use ArcListASCIIGraph
instead.
ShiftedByOneArcListASCIIGraph
can be used if your input
data numbers (rather insensibly) nodes starting from one.
Arc-labelled graphs are represented using implementations of ArcLabelledImmutableGraph
.
Most arc-labelled graphs are based on an underlying ImmutableGraph
, and
the ArcLabelledImmutableGraph
implementation just provides
label handling. The example IntegerTriplesArcLabelledImmutableGraph
shows how to expose your data as an instance of ArcLabelledImmutableGraph
,
so you can save your data using your preferred combination of implementations.
WebGraph requires Java ≥6 and relies on fastutil 6.4 or greater for high-performance containers and algorithms, on the COLT distribution for statistics, on the DSI utilities for bit-level I/O, on Sux4J for succinct data structures, on JSAP for line-command parsing and on log4j for logging.
Note that in principle the DSI utilities depend on a number of additional useful libraries from the Jakarta commons project, including collections, lang, configuration and io.
|
|||||||||
PREV NEXT | FRAMES NO FRAMES |