This documentation provides critical information needed to help you write ODL Applications/Projects using Infrautils, which offers various generic utilities and infrastructure for ease of application development.
Contents:
Starting from Carbon, the InfraUtils project uses RST-format Design Specification documents for all new features. These specifications are an excellent way to understand the various InfraUtils features.
Contents:
Table of Contents
[link to gerrit patch]
Brief introduction of the feature.
Detailed description of the problem being solved by this feature
Details of the proposed change.
This should detail any changes to yang models.
Are any configuration parameters being added or deprecated for this feature? What will their defaults be? How will it impact existing deployments?
Note that outright deletion/modification of existing configuration is not allowed due to backward compatibility. They can only be deprecated and deleted in later release(s).
This should capture how clustering will be supported. This can include, but is not limited to, the use of CDTCL, EOS, Cluster Singleton, etc.
This should capture impact from/to different infra components like the MD-SAL Datastore, Karaf, AAA, etc.
Document any security related issues impacted by this feature.
What are the potential scale and performance impacts of this change? Does it help improve scale and performance or make it worse?
What release is this feature targeted for?
Alternatives considered and why they were not selected.
How will end user use this feature? Primary focus here is how this feature will be used in an actual deployment.
For most InfraUtils features the users will be other projects, but this should still capture any user-visible CLI/API, e.g. Counters.
This section will be the primary input for the Test and Documentation teams. Along with the above, this should also capture the REST API and CLI.
odl-infrautils-all
Identify existing karaf feature to which this change applies and/or new karaf features being introduced. These can be user facing features which are added to integration/distribution or internal features to be used by other projects.
Who is implementing this feature? In case of multiple authors, designate a primary assignee and other contributors.
Break up the work into individual items. This should be a checklist on the Trello card for this feature. Give a link to the Trello card or duplicate it.
Any dependencies being added/removed? Dependencies here refer to internal [other ODL projects] as well as external [OVS, Karaf, JDK, etc.] dependencies. This should also capture the specific versions of any of these dependencies, e.g. OVS version, Linux kernel version, JDK version, etc.
This should also capture impacts on existing projects that depend on InfraUtils. The following projects currently depend on InfraUtils:
* Netvirt
* GENIUS
What is the impact on documentation for this change? If a documentation change is needed, call out one of the <contributors> who will work with the Project Documentation Lead to get the changes done.
Don’t repeat details already discussed but do reference and call them out.
Add any useful references. Some examples:
[1] OpenDaylight Documentation Guide
[2] https://specs.openstack.org/openstack/nova-specs/specs/kilo/template.html
Note
This template was derived from [2], and has been modified to support our project.
This work is licensed under a Creative Commons Attribution 3.0 Unported License. http://creativecommons.org/licenses/by/3.0/legalcode
Table of Contents
https://git.opendaylight.org/gerrit/#/q/topic:JC
Job Coordinator is a framework for executing jobs sequentially or in parallel, based on their job-keys. One such job, to give an example, can be an MD-SAL config/operational datastore update.
The concept of the datastore job coordinator was derived from the following pattern seen in many ODL project implementations:
This feature will support following use cases:
The proposed feature adds a new module in infrautils called “jobcoordinator”, which will have the following functionalities:
N/A
Applications can define their own worker threads for their job. A job is defined as a piece of code that can be independently executed.
Applications should define a rollback worker, which will have the code to be executed in case the main job fails permanently. In usual scenarios, this will be the code to clean up all partially completed transactions by the main worker.
Applications should carefully choose the job-key for their job worker. All jobs based on the same job-key will be executed sequentially, and jobs with different keys will be executed in parallel, depending on the available threadpool size.
Applications can enqueue their job worker to the JC framework for execution. JC has a hash structure to handle the execution of the tasks sequentially or in parallel. Whenever a job is enqueued, JC creates a Job Entry for that job. A Job Entry is characterized by the job-key, the main worker, the rollback worker and the number of retries. This JobEntry is added to a JobQueue, which in turn is part of a JobQueueMap.
A JobQueueHandler task runs periodically and polls each of the JobQueues to execute the main task of the corresponding JobEntry. Within a JobQueue, execution is synchronized.
The list of listenable futures for the transactions from the application's main worker is available to JC; if a transaction fails, the main worker is retried up to the application-specified 'max-retries' number of times. If all the retries fail, JC bails out and the rollback worker is executed.
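As an illustration of the main-worker contract described above, here is a minimal sketch; the class and helper names are made up for the example, and only the Callable<List<ListenableFuture<Void>>> shape comes from this spec::

  import java.util.Collections;
  import java.util.List;
  import java.util.concurrent.Callable;

  import com.google.common.util.concurrent.Futures;
  import com.google.common.util.concurrent.ListenableFuture;

  // Illustrative main worker: performs one unit of work and returns the futures of the
  // transactions it submitted, so that JC can track their success or failure.
  public class AddInterfaceWorker implements Callable<List<ListenableFuture<Void>>> {

      private final String interfaceName;

      public AddInterfaceWorker(String interfaceName) {
          this.interfaceName = interfaceName;
      }

      @Override
      public List<ListenableFuture<Void>> call() {
          // In a real application this would open an MD-SAL transaction, write the
          // configuration for interfaceName, and return the future of the submit.
          ListenableFuture<Void> future = writeInterfaceConfig(interfaceName);
          return Collections.singletonList(future);
      }

      private ListenableFuture<Void> writeInterfaceConfig(String name) {
          // Placeholder standing in for the actual datastore write.
          return Futures.immediateFuture(null);
      }
  }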
This feature aims at improving the scale and performance of applications by providing the capability to execute their functions in parallel wherever it can be done.
Carbon.
JC synchronization is not currently clusterwide.
N/A
This feature doesn’t add any new karaf feature.
JobCoordinator provides the below APIs which can be used by other applications:
void enqueueJob(String key, Callable<List<ListenableFuture<Void>>> mainWorker).
void enqueueJob(String key, Callable<List<ListenableFuture<Void>>> mainWorker, RollbackCallable rollbackWorker).
void enqueueJob(String key, Callable<List<ListenableFuture<Void>>> mainWorker, int maxRetries).
void enqueueJob(String key, Callable<List<ListenableFuture<Void>>> mainWorker, RollbackCallable rollbackWorker, int maxRetries).
key is the JobKey used for synchronization, mainWorker is the actual job task, maxRetries is the number of times a job will be retried if the mainWorker results in ERROR, and rollbackWorker is the task to be executed if the job still fails with ERROR after maxRetries attempts.
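A minimal usage sketch follows; the package name, worker classes and Blueprint injection are illustrative assumptions, and only the enqueueJob signatures above come from this spec::

  import org.opendaylight.infrautils.jobcoordinator.JobCoordinator;

  public class InterfaceConfigService {

      private final JobCoordinator jobCoordinator;

      public InterfaceConfigService(JobCoordinator jobCoordinator) {
          this.jobCoordinator = jobCoordinator; // e.g. injected via Blueprint
      }

      public void addInterface(String interfaceName) {
          // Jobs sharing this key run sequentially; jobs with different keys may run in parallel.
          String jobKey = "interface-" + interfaceName;

          // Main worker only, with the default number of retries and no rollback worker.
          jobCoordinator.enqueueJob(jobKey, new AddInterfaceWorker(interfaceName));

          // Main worker plus a rollback worker (assumed to extend RollbackCallable),
          // retried up to 3 times before the rollback worker is executed.
          jobCoordinator.enqueueJob(jobKey,
                  new AddInterfaceWorker(interfaceName),
                  new AddInterfaceRollbackWorker(interfaceName),
                  3);
      }
  }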
Appropriate UTs will be added for the new code once the framework is in place.
This will require changes to Developer Guide.
Developer Guide can capture the new set of APIs added by JobCoordinator as mentioned in API section.
Table of Contents
https://git.opendaylight.org/gerrit/#/q/topic:s-n-d
Status reporting is an important part of any system. This document explores and describes various implementation options for achieving the feature.
Today ODL does not have a centralized mechanism for status and diagnostics of the various service modules, or for predictable system initialization. This leads to a lot of confusion about when a particular service should start acting upon the various incoming system events, because in many cases (like restarts) services end up doing premature service handling.
The feature aims at developing a status and diagnostics framework for ODL, which can:
This feature will support following use cases:
The proposed feature adds a new module in infrautils called “diagstatus”, which allows a CLI or other suitable interface to query the status of the services running in the context of the controller (interfaces like OpenFlow, OVSDB, ELAN, ITM, IFM, Datastore, etc.). It also allows individual services to push status changes to this centralized module via suitable API-based notifications. There shall be a generic set of events which applications can report to the central monitoring module/service, which shall be used by that service to update the latest/current status of the services.
A status model object encapsulating status metadata such as:
Applications must invoke the status-reporting APIs as required across the lifecycle of their services, in the start-up, operational and graceful-shutdown phases. In order to emulate a simpler state machine, we can have services report the following statuses:
* STARTING – at the start of onSessionInitiated() on the instrumented service
* OPERATIONAL – at the end of onSessionInitiated() on the instrumented service
* ERROR – if any exception is caught during service bring-up, or if the service goes into an ERROR state dynamically
* REGISTER – on successful registration of the instrumented service
* UNREGISTER – when a service unregisters from diagstatus on its own
N/A
Whenever a new service comes up, the service provider should register the new service in the service registry.
Applications can report their status using the diagstatus APIs.
Whenever applications/CLI try to fetch the service status, the diagstatus module queries the status through the respective OSGi service implementations exposed by each service, and an aggregated result is provided as the response.
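To illustrate the registration and reporting flow described above, a hedged sketch; the DiagStatusService, ServiceDescriptor and ServiceState names are assumptions based on the description in this spec, so consult the diagstatus JavaDoc for the exact API::

  import org.opendaylight.infrautils.diagstatus.DiagStatusService;
  import org.opendaylight.infrautils.diagstatus.ServiceDescriptor;
  import org.opendaylight.infrautils.diagstatus.ServiceState;

  public class ElanStatusReporter {

      private static final String SERVICE_NAME = "ELAN";

      private final DiagStatusService diagStatusService;

      public ElanStatusReporter(DiagStatusService diagStatusService) {
          this.diagStatusService = diagStatusService;   // e.g. injected via Blueprint
          diagStatusService.register(SERVICE_NAME);     // register the service in the registry
      }

      public void onSessionInitiated() {
          report(ServiceState.STARTING, "ELAN initialization in progress");
          try {
              // ... actual service bring-up ...
              report(ServiceState.OPERATIONAL, "ELAN is up");
          } catch (RuntimeException e) {
              report(ServiceState.ERROR, "ELAN failed to start: " + e.getMessage());
          }
      }

      private void report(ServiceState state, String description) {
          diagStatusService.report(new ServiceDescriptor(SERVICE_NAME, state, description));
      }
  }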
N/A as it is a new feature which does not impact any current functionality.
Carbon.
The initial feature will not have the health check functionality. The initial feature will not have integration with the infrautils counter framework for displaying diag-counters.
N/A
This feature adds a new karaf feature, which is odl-infrautils-diagstatus.
Following are the service APIs which must be supported by the framework:
Following CLIs will be supported as part of this feature:
Following osgi services will be supported as part of this feature:
This is a new module and requires the below libraries:
This change is backwards compatible, so no impact on dependent projects. Projects can choose to start using this when they want.
Following projects currently depend on InfraUtils:
Appropriate UTs will be added for the new code once the framework is in place.
Since component-style unit tests will be added for the feature, there is no need for ITs.
This will require changes to User Guide and Developer Guide.
The User Guide will need to add information on how to use the status-and-diag APIs and CLIs.
The Developer Guide will need to capture how to use the APIs of the status-and-diag module to derive service-specific actions. Also, the documentation needs to capture how services can expose their status via an MBean and integrate the same with the status-and-diag module.
Table of Contents
https://git.opendaylight.org/gerrit/#/q/topic:bug/8300 https://www.youtube.com/watch?v=h4HOSRN2aFc
Infrautils Caches provide a Cache of keys to values. The implementation of the Infrautils Cache API is typically backed by established cache frameworks, such as Ehcache, Infinispan, Guava's, Caffeine, imcache, cache2k, etc.
Caches are not Maps! Differences include that a Map persists all elements that are added to it until they are explicitly removed. A Cache, on the other hand, is generally configured to evict entries automatically, in order to constrain its memory footprint, based on some policy. Another notable difference, enforced by this caching API, is that caches should not be thought of as data structures that you put something into somewhere in your code in order to get it out of somewhere else. Instead, a Cache is “just a façade” to a CacheFunction’s get. This design enforces proper encapsulation, and helps you not to screw up the content of your cache (like you easily can, and usually do, when you use a Map as a cache).
This feature will support following use cases:
The proposed feature adds a new module in infrautils called “caches”, which will have the following functionalities:
N/A
Applications can define their own Cache, with specified CacheFunction and Policies. Caches can be configured to set the maximum entries and also to enable stats.
Cache is “just a façade” to a CacheFunction’s get(). When the user defines a CacheFunction for their cache, it will be executed whenever a get() is executed on the cache.
Anchor refers to the instance of the class “containing” this Cache. It is used by CacheManagers for display to the end user.
Cache id is a short ID for this cache, and description is a one-line, human-readable description of the cache.
Cache Eviction Policy is based on the number of entries the cache can hold, which will be set during the cache creation time.
This feature is aiming at improving the scale and performance of applications by helping to define a CacheFunction for heavy operations.
Carbon.
Cache is currently neither distributed (cluster wide) nor transactional.
N/A
odl-infrautils-caches
odl-infrautils-caches-sample
Caches provides the below APIs which can be used by other applications:
CacheProvider APIs
<K, V> Cache<K, V> newCache(CacheConfig<K, V> cacheConfig, CachePolicy initialPolicy);
<K, V> Cache<K, V> newCache(CacheConfig<K, V> cacheConfig);
<K, V, E extends Exception> CheckedCache<K, V, E> newCheckedCache(CheckedCacheConfig<K, V, E> cacheConfig, CachePolicy initialPolicy);
<K, V, E extends Exception> CheckedCache<K, V, E> newCheckedCache(CheckedCacheConfig<K, V, E> cacheConfig);
CacheManager APIs
BaseCacheConfig getConfig();
CacheStats getStats();
CachePolicy getPolicy();
void setPolicy(CachePolicy newPolicy);
void evictAll();
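As an illustration of the CacheProvider API above, a hedged sketch; the CacheConfigBuilder/CachePolicyBuilder names and their methods are assumptions modelled on the caches sample, so consult the caches JavaDoc for the exact API::

  import org.opendaylight.infrautils.caches.Cache;
  import org.opendaylight.infrautils.caches.CacheConfigBuilder;
  import org.opendaylight.infrautils.caches.CachePolicyBuilder;
  import org.opendaylight.infrautils.caches.CacheProvider;

  public class HostnameResolver {

      private final Cache<String, String> hostnameCache;

      public HostnameResolver(CacheProvider cacheProvider) {
          this.hostnameCache = cacheProvider.newCache(
              new CacheConfigBuilder<String, String>()
                  .anchor(this)                                // instance "containing" this Cache
                  .id("hostname-cache")                        // short ID
                  .description("IP address to hostname lookups")
                  .cacheFunction(ip -> expensiveDnsLookup(ip)) // executed on cache miss
                  .build(),
              new CachePolicyBuilder()
                  .maxEntries(1000)                            // eviction policy by entry count
                  .statsEnabled(true)
                  .build());
      }

      public String lookup(String ip) {
          // Returns the cached value, or invokes the CacheFunction and caches the result.
          return hostnameCache.get(ip);
      }

      private String expensiveDnsLookup(String ip) {
          // Placeholder standing in for the actual heavy operation.
          return "host-" + ip;
      }
  }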
Appropriate UTs will be added for the new code once the framework is in place.
This will require changes to Developer Guide.
Developer Guide can capture the new set of APIs added by Caches as mentioned in API section.
Table of Contents
This project offers technical utilities and infrastructures for other projects to use.
The conference presentation slides linked to in the references section at the end give a good overview of the project.
Check out the JavaDoc on https://javadocs.opendaylight.org/org.opendaylight.infrautils/fluorine/.
A bunch of small (non-test-related), low-level, general utility classes à la Apache (Lang) Commons or Guava and similar, incl. utils.concurrent:
JobCoordinator service which enables executing jobs in a parallel/sequential fashion based on their keys.
Infrastructure to detect when Karaf is ready.
The implementation internally uses the same Karaf API that e.g. the standard “diag” Karaf CLI command uses. This checks both whether all OSGi bundles have started and whether their blueprint initialization has completed successfully.
It builds on top of the bundles-test-lib from odlparent, which is what we run as SingleFeatureTest (SFT) during all of our builds to ensure that all projects’ features can be installed without broken bundles.
The infrautils.diagstatus module builds on top of this infrautils.ready.
What infrautils.ready adds on top of the underlying raw Karaf API is operator-friendly logging, a convenient API and a correctly implemented polling loop in a background thread with SystemReadyListener registrations and notifications, instead of ODL applications re-implementing this. The infrautils.ready project intentionally isolates consumers from the Karaf API. We encourage all ODL projects to use this infrautils.ready API instead of trying to reinvent the wheel and directly depending on the Karaf API, so that application code can be used outside of OSGi, in environments such as unit and component tests, or something such as honeycomb.
Applications can use the SystemReadyMonitor’s registerListener(SystemReadyListener) in a constructor to register a listener and get notified when all bundles are “ready” in the technical sense (they have been started in the OSGi sense and have completed their blueprint initialization), and can on that event do any further initialization they had to delay from the original blueprint initialization.
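For example, a hedged sketch of such a listener registration; the consumer class and its deferred work are made up, while SystemReadyMonitor, SystemReadyListener and onSystemBootReady() are the infrautils.ready names described here::

  import org.opendaylight.infrautils.ready.SystemReadyListener;
  import org.opendaylight.infrautils.ready.SystemReadyMonitor;

  // Illustrative consumer; the class name and the deferred work are made up for the example.
  public class MyApplication implements SystemReadyListener {

      public MyApplication(SystemReadyMonitor systemReadyMonitor) {
          // systemReadyMonitor is typically obtained via Blueprint dependency injection
          systemReadyMonitor.registerListener(this);
      }

      @Override
      public void onSystemBootReady() {
          // All bundles have started and completed their blueprint initialization;
          // perform any initialization that was deliberately delayed until now.
      }
  }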
This cannot directly be used to express functional dependencies BETWEEN bundles (because that would deadlock infrautils.ready; it would stay in BOOTING forever and never reach SystemState ACTIVE). The natural way to make one bundle await another is to use a Blueprint OSGi service dependency. If there is no technical service dependency but only a logical, functional one, then infrautils.ready.order offers a convenience sugar utility to publish “marker” FunctionalityReady interfaces to the OSGi service registry; unlike real services, these have no implementing code, but another bundle can depend on one to enforce start-up order (using a regular Blueprint <reference> in XML or the @OsgiService annotation).
A known limitation of the current implementation of infrautils.ready is that its “wait until ready” loop runs only once, after installation of infrautils.ready (by boot feature usage, or an initial single-line feature:install). So SystemState will go from BOOTING to ACTIVE or FAILURE, once. If you do more feature:install after a time gap, there won’t be any further state change notification; the current implementation won’t “go back” from ACTIVE to BOOTING. (It would be absolutely possible to extend SystemReadyListener onSystemBootReady() with an onSystemIsChanging() and onSystemReadyAgain(), but the original author had no need for this, as “hot” installing additional ODL application features during operational uptime was not a real-world requirement for the original author. If this is important to you, then your contributions for extending this would certainly be welcome.)
infrautils’ ready, like other infrautils APIs, is available as a separate Karaf feature. Downstream projects using infrautils.ready will therefore NOT pull in other bundles for other infrautils functionalities.
See https://bugs.opendaylight.org/show_bug.cgi?id=8438 and https://git.opendaylight.org/gerrit/#/c/56898/
Used for non-regression self-testing of features in this project (and available to others).
See https://www.youtube.com/watch?v=h4HOSRN2aFc and play with the example in infrautils/caches/sample installed by odl-infrautils-caches-sample; history in https://git.opendaylight.org/gerrit/#/c/48920/ and https://bugs.opendaylight.org/show_bug.cgi?id=8300.
To be documented.
infrautils.metrics offers a simple back-end neutral API for all ODL applications to report technical as well as functional metrics.
There are different implementations of this API allowing operators to exploit metrics in the usual ways - aggregate, query, alerts, etc.
The odl-infrautils-metrics Karaf feature includes the API and the local Dropwizard implementation.
Application code uses the org.opendaylight.infrautils.metrics.MetricProvider API, typically looked up from the OSGi service registry using e.g. the Blueprint annotations @Inject @OsgiService, to register new Meters (to “tick/mark events” and measure their rate), Counters (for things that go up and down again), and Timers (to stopwatch durations). Support for “Gauges” is to be added; contributions welcome.
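A hedged sketch of what such an instrumented class might look like; the (anchor, id) style newMeter() overload and Meter.mark() follow the Dropwizard-style API described above and should be checked against the MetricProvider JavaDoc::

  import org.opendaylight.infrautils.metrics.Meter;
  import org.opendaylight.infrautils.metrics.MetricProvider;

  public class PacketHandler {

      private final Meter packetsReceivedMeter;

      public PacketHandler(MetricProvider metricProvider) {
          // metricProvider is typically obtained via Blueprint, e.g. @Inject @OsgiService
          this.packetsReceivedMeter = metricProvider.newMeter(this, "odl.myapp.packets.received");
      }

      public void onPacketReceived() {
          packetsReceivedMeter.mark(); // tick the event so its rate can be tracked
      }
  }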
Each metric can be labeled, possibly along more than one dimension.
The org.opendaylight.infrautils.metrics.testimpl.TestMetricProviderImpl is a suitable implementation of the MetricProvider for tests.
Based on Dropwizard Metrics (by Coda Hale at Yammer), see http://metrics.dropwizard.io; it exposes metrics to JMX and can regularly dump stats into simple local files; background slides: https://codahale.com/codeconf-2011-04-09-metrics-metrics-everywhere.pdf
This implementation is “embedded” and requires no additional external systems.
It is configured via the local configuration file at etc/org.opendaylight.infrautils.metrics.cfg.
This includes a thread deadlock detection and a maximum-number-of-threads warning feature.
Implementation based on Prometheus from the Linux Foundation's Cloud Native Computing Foundation, see https://prometheus.io
This implementation exposes metrics by HTTP on /metrics/prometheus from the local ODL to an external Prometheus set up to scrape that.
This presentation given at the OpenDaylight Fluorine Developer Design Forum in March 2018 at ONS in LA gives a good overview about the infrautils.metrics.prometheus implementation.
This implementation requires operators to separately install Prometheus, which is not a Java OSGi application that can be feature:install'd into Karaf, but an external application (installed via Docker, RPM, tar.gz, etc.). Prometheus is then configured with the URLs of the ODL nodes, and “scrapes” metrics from ODL at configurable regular intervals. Prometheus is extensively configurable for typical metrics use cases, including alerting, and has existing integrations with other related systems.
The odl-infrautils-metrics-prometheus Karaf feature installs this. It has to be installed by feature:install or featuresBoot BEFORE any ODL application feature which depends on the odl-infrautils-metrics feature (similarly to e.g. odl-mdsal-trace).