GraphML analyses graphs for the following measures:
ranked shortest paths
These calculations help your users understand ways to travel through (or ‘traverse’) a network.
The distance function measures how many hops apart two nodes are in a network. Shortest path highlights the route that passes through the lowest number of nodes.
Hops can also be weighted, meaning you can calculate actual distances, as well as the number of hops.
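The difference between hop count and weighted distance can be sketched as follows. This is a minimal illustration using Dijkstra's algorithm, not GraphML's own implementation; the function name `shortest_path` and the example network are assumptions made for this sketch.

```python
import heapq

def shortest_path(graph, source, target):
    """Return (distance, path) for the cheapest route, or (inf, []) if unreachable.

    `graph` maps each node to a dict of {neighbour: weight}.
    """
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbour, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    if target not in best:
        return float("inf"), []
    # Walk back from the target to rebuild the route
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]

# Hypothetical network: A-B-D is two expensive hops, A-C-E-D is three cheap hops
weighted = {
    "A": {"B": 10.0, "C": 1.0},
    "B": {"D": 10.0},
    "C": {"E": 1.0},
    "E": {"D": 1.0},
    "D": {},
}
# Same network with every hop given a weight of 1, so distance equals hop count
hops = {node: {nb: 1.0 for nb in nbrs} for node, nbrs in weighted.items()}

print(shortest_path(hops, "A", "D"))      # (2.0, ['A', 'B', 'D'])
print(shortest_path(weighted, "A", "D"))  # (3.0, ['A', 'C', 'E', 'D'])
```

With unit weights the route with the fewest hops wins; with real weights the longer but cheaper route is preferred.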
finding communities
Uses the Louvain method for finding communities in large networks, as described in [Blondel et al., 2008]. The main concept is network modularity, which assesses the quality of the current community partition. The algorithm successively improves the network's modularity by trying to change the community that each node belongs to. When no change improves modularity any further, the algorithm stops and keeps the current partition as its result.
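To make the modularity score concrete, here is a small sketch that computes it for an undirected, unweighted graph. It shows only the quality measure, not the full Louvain procedure; the function name `modularity` and the example graph are assumptions for illustration.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities of (internal_edges / m) - (total_degree / 2m)^2,
    where m is the number of edges.
    """
    m = len(edges)
    internal = {}  # edges with both ends in the same community
    degree = {}    # sum of node degrees per community
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree[ca] = degree.get(ca, 0) + 1
        degree[cb] = degree.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())

# Hypothetical graph: two triangles (1,2,3) and (4,5,6) joined by the edge 3-4
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]

two_triangles = {1: "x", 2: "x", 3: "x", 4: "y", 5: "y", 6: "y"}
one_community = {n: "x" for n in range(1, 7)}
node_3_moved = {1: "x", 2: "x", 3: "y", 4: "y", 5: "y", 6: "y"}

print(modularity(edges, two_triangles))  # about 0.357
print(modularity(edges, one_community))  # 0.0
print(modularity(edges, node_3_moved))   # about 0.122
```

Moving node 3 out of its triangle lowers the score, so a modularity-driven search would reject that move and keep the two-triangle partition.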
finding duplicates
Uses the Double Metaphone phonetic encoding algorithm to find potentially duplicate entities.
Social Network Analysis (SNA)
closeness
This is the measure that helps you find the nodes that are closest to the other nodes in a network, based on their ability to reach them.
To calculate this, the algorithm finds the shortest path between each pair of nodes, then assigns each node a score based on the sum of the lengths of its shortest paths to all other nodes.
Nodes with a high closeness value have a lower distance to all other nodes. They’d be efficient broadcasters of information.
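A minimal sketch of this calculation on an unweighted graph is shown below. It assumes one common normalisation (number of reachable nodes divided by the sum of distances), which may differ from the formula GraphML uses; the function name `closeness` and the example network are hypothetical.

```python
from collections import deque

def closeness(adjacency):
    """Closeness score per node: reachable nodes / sum of hop distances to them."""
    scores = {}
    for source in adjacency:
        # Breadth-first search gives hop distances from `source`
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total = sum(distance.values())
        scores[source] = (len(distance) - 1) / total if total else 0.0
    return scores

# Hypothetical star network: 'hub' is linked to three leaves
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
scores = closeness(star)
print(scores["hub"])  # 1.0
print(scores["a"])    # 0.6
```

The hub reaches every node in one hop and gets the highest score; each leaf needs two hops to reach the other leaves and scores lower.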
betweenness
Nodes with a high betweenness centrality score are the ones that most frequently act as ‘bridges’ between other nodes. They lie on the shortest pathways of communication within the network.
Usually this would indicate important gatekeepers of information between groups.
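The idea of counting how often a node sits on shortest paths can be sketched with Brandes' algorithm for unweighted, undirected graphs. This is an illustrative sketch, not GraphML's implementation; the function name `betweenness` and the example network are assumptions.

```python
from collections import deque

def betweenness(adjacency):
    """Unnormalised betweenness centrality (Brandes) for an undirected graph."""
    score = {v: 0.0 for v in adjacency}
    for s in adjacency:
        stack = []
        predecessors = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        # Accumulate dependencies, farthest nodes first
        delta = {v: 0.0 for v in adjacency}
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both ends
    return {v: c / 2 for v, c in score.items()}

# Hypothetical network: two triangles (1,2,3) and (4,5,6) bridged by the edge 3-4
network = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5, 6], 5: [4, 6], 6: [4, 5]}
result = betweenness(network)
print(result[3], result[4], result[1])  # 6.0 6.0 0.0
```

Nodes 3 and 4 are the gatekeepers here: every shortest path between the two triangles passes through them.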
degree
The degree centrality measure finds nodes with the highest number of links to other nodes in the network. Nodes with a high degree centrality have the best connections to those around them; they might be influential, or just strategically well-placed.
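Degree centrality is the simplest of the three measures to compute. The sketch below counts links per node from an edge list; the function name `degree_centrality` and the example edges are hypothetical.

```python
def degree_centrality(edges):
    """Count the number of links attached to each node of an undirected graph."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree

# Hypothetical edge list
links = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b")]
degrees = degree_centrality(links)

# Rank nodes from most to least connected
ranked = sorted(degrees.items(), key=lambda item: item[1], reverse=True)
print(ranked[0])  # ('hub', 3)
```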
Mandatory:
- host:
- Linux
- Windows (not tested but should work)
- target:
- Linux (services)
- Google Chrome web browser
- .NET Core SDK v5.0
- integrated development environment:
- Visual Studio Code (Linux or Windows)
- JetBrains Rider (Linux or Windows)
- Visual Studio (Windows)
- nodejs
- git
- Google Chrome web browser
- primary web client
- does not work with Apache ActiveMQ admin page
- Firefox web browser
- required to view Apache ActiveMQ admin page
- database:
- Microsoft SQL Server
- MySQL or MariaDB
- PostgreSQL
- SQLite (local development only)
- message queue:
- results store:
- host:
-
Optional
- Git Extensions (Windows)
- Docker (Windows)
- SwitchStartupProject for VS 2019 (Visual Studio)
- npm
- Redis Commander
- DBeaver
- DB Browser for SQLite
- SQLiteStudio
- Microsoft SQL Server Management Studio (Windows)
- ReportGenerator
- python
- Doxygen
- dot
Building
- clone repo
git clone https://github.com/TrevorDArcyEvans/GraphML.git
- build
dotnet restore
dotnet build
- run tests
dotnet test
- run code coverage
dotnet test /p:CollectCoverage=true /p:CoverletOutputFormat=opencover
- generate code coverage report
reportgenerator -reports:**/coverage.opencover.xml -targetdir:./CodeCoverage
- generate documentation
doxygen
- open documentation
Back End
- run API
export ASPNETCORE_ENVIRONMENT=Development
cd GraphML.API/bin/Debug/net5.0
./GraphML.API
- open Swagger UI
- start Apache ActiveMQ
- start Redis
- run IdentityServer4
export ASPNETCORE_ENVIRONMENT=Development
cd IdentityServerAspNetIdentity/bin/Debug/net5.0
./IdentityServerAspNetIdentity
- open IdentityServer4 Login
- open IdentityServer4 Discovery Document
- run Analysis Server
export ASPNETCORE_ENVIRONMENT=Development
cd GraphML.Analysis.Server/bin/Debug/net5.0
./GraphML.Analysis.Server
- open Apache ActiveMQ management console
- start Redis Commander
redis-commander --port 8080
Front End/s
GraphML.UI.Web
export ASPNETCORE_ENVIRONMENT=Development
cd GraphML.UI.Web/bin/Debug/net5.0
./GraphML.UI.Web
Backend API
Variable | Description | Example Value |
---|---|---|
ASPNETCORE_ENVIRONMENT | ASP.NET Core runtime environment | Production, Development, Test |
API_URI | API server URL used by GraphML.API.Server to retrieve data | |
DATASTORE_CONNECTION | SqLite | |
DATASTORE_CONNECTION_TYPE | SqLite | |
DATASTORE_CONNECTION_STRING | Data Source=\|DataDirectory\|Data/GraphML.sqlite3;Foreign Keys=True; | |
LOG_CONNECTION_STRING | .NET connection string for database logging | |
RESULT_DATASTORE | Redis URL | localhost:6379 |
MESSAGE_QUEUE_URL | Apache ActiveMQ URL | activemq:tcp://localhost:61616 |
MESSAGE_QUEUE_NAME | GraphML | |
MESSAGE_QUEUE_POLL_INTERVAL_S | time in seconds between checking for new analysis jobs | 5 |
MESSAGE_QUEUE_USE_THREADS | False | |
Components
The following components are used to analyse a graph:
Description
Base
Abstract entities which are ancestors for other GraphML entities.
- Item
- Ultimate ancestor of all GraphML objects.
- Models something which can be persisted.
- Every item ultimately belongs to an Organisation
- OwnedItem
- Something which has an immediate owner, other than an Organisation
Containers
Entities which serve as a holding place for other entities.
- Organisation
- Typically a company, organisation or other legal entity in which people work together.
- police force
- GCHQ
- FBI
- military
- bank
- Used to isolate information between different Organisations
- Id and OrganisationId must be the same
- RepositoryManager
- A means to group a subset of Repository in an Organisation in some logical manner.
- For example, repositories could be grouped at a departmental level eg 'Financial Fraud' or 'Credit Control'.
- ItemAttributeDefinition are held at RepositoryManager level so they can be shared across Repository.
- Repository
- A complete collection of Node and Edge representing an area of interest.
- Graph
- A subset of Nodes and Edges from a Repository which have been extracted for separate analysis.
- A Graph may be directed, in contrast to a Repository, which has no notion of direction.
- Chart
- A 2D pictorial representation of a subset of Nodes and Edges from a Graph.
- Generally used to visualise analysis results.
- Default implementation is a Diagram.
- Layout algorithms can be applied to change the position of Nodes and Edges.
- Timeline
- A 2D pictorial representation of a subset of Nodes and Edges from a Graph.
- Generally used to visualise temporal (time based) data.
- Default implementation is a gantt chart.
Graph
- RepositoryItem
- Something which is in a Repository, either a Node or an Edge
- Node
- A vertex representing something of interest.
- A Node may be connected to zero or more other Nodes, each by an Edge
- A Node may have properties associated with it via a NodeItemAttribute
- Edge
- A link connecting two Nodes.
- An Edge may have one or more weights (or other properties) associated with it via an EdgeItemAttribute
- An Edge is not directed 'per se'; this is set on the Graph
- GraphItem
- Something which is in a Graph, either a GraphNode or a GraphEdge
- GraphNode
- A Node which appears in a Graph.
- Name may be different to that of underlying Node
- GraphEdge
- An Edge which appears in a Graph.
- Name may be different to that of underlying Edge
- ChartItem
- Something which is in a Chart, either a ChartNode or a ChartEdge
- ChartNode
- A Node which appears in a Chart.
- Name may be different to that of underlying Node
- ChartEdge
- An Edge which appears in a Chart.
- Name may be different to that of underlying Edge
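The point that direction is set on the Graph rather than on each Edge can be sketched as follows. This is an illustration only: the `Edge` class, its `source` and `target` fields, and the `adjacency` function are hypothetical names, and the sketch assumes an Edge simply stores its two endpoints in some order.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    # Hypothetical fields: the two endpoints, stored in some order
    source: str
    target: str

def adjacency(edges, directed):
    """Build neighbour sets; the `directed` flag belongs to the graph, not the edge."""
    result = {}
    for edge in edges:
        result.setdefault(edge.source, set()).add(edge.target)
        result.setdefault(edge.target, set())
        if not directed:
            # In an undirected graph the same edge can be traversed both ways
            result[edge.target].add(edge.source)
    return result

repository_edges = [Edge("A", "B"), Edge("B", "C")]
as_directed = adjacency(repository_edges, directed=True)
as_undirected = adjacency(repository_edges, directed=False)
print(as_directed)
print(as_undirected)
```

The same stored edges yield different traversals depending only on the flag chosen when the graph is extracted.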
Attributes
ItemAttributeDefinition are held at RepositoryManager level so they can be shared across Repository.
- ItemAttributeDefinition
- Defines shape (name and data type) of information in an ItemAttribute
- RepositoryItemAttributeDefinition
- Defines shape of information in a RepositoryItemAttribute
- GraphItemAttributeDefinition
- Defines shape of information in a GraphItemAttribute
- NodeItemAttributeDefinition
- Defines shape of information in a NodeItemAttribute
- EdgeItemAttributeDefinition
- Defines shape of information in an EdgeItemAttribute
- ItemAttribute
- Additional information attached to an Item
- RepositoryItemAttribute
- Additional information attached to a Repository
- GraphItemAttribute
- Additional information attached to a Graph
- NodeItemAttribute
- Additional information attached to a Node
- EdgeItemAttribute
- Additional information attached to an Edge
- Currently supported data types:
- string
- bool
- int
- double
- DateTime (UTC)
- DateInterval (UTC)
Support
- Contact
- A person identified by their email address.
- The email address (Name) is used to link authentication (IdentityServer4) to Role.
- Role
- The function performed by a Contact in the context of GraphML.
- There are several, predefined functions in Roles
- A Contact may have one or more Roles
- Roles
- User roles within GraphML
Roles and Users
- enable Development mode by setting env var:
export ASPNETCORE_ENVIRONMENT=Development
- authentication (who you are) is handled by IdentityServer
- authorisation (what you can do) is handled by GraphML, based on an email claim
- security is role based, with the following predefined roles:
Role | Description |
---|---|
User | An entity using GraphML |
UserAdmin | An entity managing a subset of data within GraphML, typically data belonging to a single organisation |
Admin | An entity managing all data within GraphML |
- the above roles are owned by System organisation
- SwaggerUI is only enabled in Development mode
- SwaggerUI authentication will redirect to a login screen in IdentityServer
- GraphML and IdentityServer4 have some test users:
UserName | Password | Email | Roles | Notes |
---|---|---|---|---|
alice | Pass123$ | DrKool@KoolOrganisation.org | Admin | system wide admin |
bob | Pass123$ | BobSmith@email.com | none | known to IdentityServer4 but not GraphML |
carol | Pass123$ | carol@KoolOrganisation.org | UserAdmin | |
dave | Pass123$ | dave@KoolOrganisation.org | User | |
eric | Pass123$ | eric@GraphML.com | User | |
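The split between authentication and authorisation can be sketched as a lookup from the email claim to a set of roles. This is a simplified illustration, not GraphML's code: the names `ROLES_BY_EMAIL`, `roles_for` and `is_authorised` are hypothetical, and exact-match comparison of email addresses is an assumption.

```python
# Hypothetical role store, populated from the test users listed above
ROLES_BY_EMAIL = {
    "DrKool@KoolOrganisation.org": {"Admin"},
    "carol@KoolOrganisation.org": {"UserAdmin"},
    "dave@KoolOrganisation.org": {"User"},
    "eric@GraphML.com": {"User"},
}

def roles_for(email_claim):
    """Roles held by the contact with this email; empty if the contact is unknown."""
    return ROLES_BY_EMAIL.get(email_claim, set())

def is_authorised(email_claim, required_role):
    """True only if the authenticated caller holds the required role."""
    return required_role in roles_for(email_claim)

# bob can log in via the identity provider but has no roles in this store
print(is_authorised("BobSmith@email.com", "User"))           # False
print(is_authorised("DrKool@KoolOrganisation.org", "Admin"))  # True
```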
How to add a new user
- add user to GraphML
GraphML:./GraphML.Datastore.Database/Data/Import.sql
- import into database
- add user to IdentityServer4
GraphML:./IdentityServerAspNetIdentity/SeedData.cs
- import into database
./IdentityServerAspNetIdentity.exe /seed
A reference browser-based GUI is provided. This is written in Blazor and uses the following components:
- Blazor.ContextMenu
- Blazorise
- Blazorise.Bootstrap
- Blazorise.Icons.FontAwesome
- BlazorPro.Spinkit
- BlazorRazor
- BlazorTable
- GraphShape (graph layout)
- MatBlazor
- Z.Blazor.Diagrams (graph visualisation)
- ChartJs.Blazor (timeline visualisation)
At this stage, printing is limited to using the web browser's native printing. Export to PDF (or other formats) is not supported by the current diagramming component (Z.Blazor.Diagrams) but may be possible with other components eg Syncfusion or Blazor.Diagrams. Obviously, replacing such a fundamental component is risky and difficult.
Icons should be 32x32 pixels in size and are resized to this for display.
There are many sources of free or low cost icons on the internet eg:
Real world, large datasets can be obtained from:
At this stage, multi-tenancy isolation is implemented in GraphML.Logic:
- GraphML.Logic.Validators
- does the initial call even make sense
- only allow calls on items which caller is allowed to access
- GraphML.Logic.Filters
- only return items relevant to the caller
- only return items caller is allowed to see
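The two layers can be sketched as follows: a validator that rejects calls on items outside the caller's organisation, and a filter that trims results to what the caller may see. The class `Item` and the functions `validate` and `filter_items` are hypothetical names for illustration, and the sketch assumes access is decided purely by matching organisation identifiers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    # Hypothetical minimal item: every item belongs to an organisation
    id: str
    organisation_id: str

def validate(caller_organisation_id, item):
    """Validator: refuse calls on items the caller is not allowed to access."""
    if item.organisation_id != caller_organisation_id:
        raise PermissionError(f"item {item.id} belongs to another organisation")
    return item

def filter_items(caller_organisation_id, items):
    """Filter: return only the items the caller is allowed to see."""
    return [item for item in items if item.organisation_id == caller_organisation_id]

items = [Item("n1", "org-a"), Item("n2", "org-b"), Item("n3", "org-a")]
visible = filter_items("org-a", items)
print([item.id for item in visible])  # ['n1', 'n3']
```

Validators guard the request on the way in; filters guard the response on the way out.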
Future work will change to a database-per-client type of isolation, which is better suited to high security environments. This will make validators and filters redundant, as all calls are guaranteed to come from the same organisation. In turn, this will make the Organisation entity redundant.
Alternatively, a dedicated deployment per organisation would achieve a similar effect at the expense of managing each deployment.
- How to reattach links
- Deleting link on portless node leaves dangling link - inconsistent with ported node
- No way to interactively create links between portless nodes
- Enhancement request: separate links between same pairs of nodes
- Enhancement request: Would like Diagram.MouseDoubleClick event
- Moving ported nodes programmatically results in links rendered incorrectly
- Enhancement request: Export to PDF
- NavigatorWidget not work with empty Diagram
Port Allocations
Service | Port | Notes |
---|---|---|
IdentityServerAspNetIdentity | 44387 | |
GraphML.API | 5001 | |
GraphML.UI.Web | 5002 | |
Apache ActiveMQ | 61616 | |
Apache ActiveMQ console | 8161 | |
Redis | 6379 | |
Redis Commander | 8080 | default port 8081 |
Microsoft SQL Server | 1433 | |
MariaDB | 3306 | |
PostgreSQL | 5432 | |
Apache ActiveMQ
You can monitor ActiveMQ using the Web Console by pointing your browser at http://localhost:8161/admin .
From ActiveMQ 5.8 onwards the web apps are secured out of the box.
The default username and password is admin/admin.
There seems to be a problem accessing the Web Console from Google Chrome, so it is recommended to use Firefox (or Microsoft Edge).
Redis
Recommended method is to use a Docker container:
docker pull redis
docker run -p 6379:6379 redis
Alternate method is to install and run Redis on WSL:
https://redislabs.com/blog/redis-on-windows-10/
sudo apt install redis-server
sudo service redis-server status
sudo service redis-server start
sudo service redis-server stop
npm install -g redis-commander
redis-commander --port 8080
open Redis Commander management console
Pro tip: to reset the database, use flushdb
This document is best viewed in Google Chrome with the Markdown Viewer extension. Remember to enable access to file URLs in the settings.
- update ranked shortest path to support temporal analysis
- going forward in time eg for financial transactions or phone calls
- support DateTimeInterval
- should be able to transform graph such that links which go backwards in time have infinite weight
- provide UI to select time attribute
- really improve timeline visualisation
- probably best to invest in Syncfusion diagramming component (!)
- improve printing/export
- probably best to invest in Syncfusion diagramming component (!)
- support AMQP
- support other datastores
- unit tests