BOSA.be
November 2020 – AFC Brussels
Bart Hanssens
BOSA DG Digital Transformation
Open Source &
Open Data
Proposed agenda
 Introduction
 Open Source
 Licenses, business models …
 Break
 Open Data
 Challenges, community, …
 Q & A
FPS BOSA DG Digital Transformation
 Formerly: Fedict
 Providing services to other federal public services
 Secure exchange of data between administrations
 Federal Authentication Service (FAS)
 Web hosting platform
 https://dt.bosa.be/en
Setting the scene
 Workshop expectations ?
 Who is already familiar with open source ?
 As a user / as a developer
 Who is already familiar with open data ?
 As a user / as a developer
Open Source
What is Open Source ?
 Open source / FLOSS
 Versus proprietary
 Open projects: more than just free code
 Questions / support ?
 Dealing with 3rd party contributions
 “Open” development: issue tracker, roadmap…
Picking “healthy” projects
 Is the code itself actively maintained ?
 E.g. releases, changes, code commits
 How active is the community (if there is one) ?
 Activity in mailing list / issue tracker
 How many different contributors ?
 (References to users / other projects) ?
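A minimal sketch of how the questions above could be turned into rough numbers, assuming commit and release data has already been exported from the project's code sharing platform. The function name `project_health`, the record layout and the 365-day window are illustrative assumptions, not an established metric.

```python
from datetime import date, timedelta


def project_health(commits, releases, today, window_days=365):
    """Summarise recent activity of a project.

    commits:  list of (commit_date, author_name) tuples (hypothetical layout)
    releases: list of release dates
    """
    since = today - timedelta(days=window_days)
    recent = [(d, author) for d, author in commits if d >= since]
    return {
        # Is the code itself actively maintained?
        "recent_commits": len(recent),
        "recent_releases": sum(1 for d in releases if d >= since),
        # How many different contributors?
        "recent_contributors": len({author for _, author in recent}),
    }


if __name__ == "__main__":
    # Made-up sample data, for illustration only
    sample_commits = [
        (date(2020, 10, 1), "alice"),
        (date(2020, 9, 1), "bob"),
        (date(2018, 1, 1), "carol"),
    ]
    sample_releases = [date(2020, 6, 1), date(2017, 1, 1)]
    print(project_health(sample_commits, sample_releases, today=date(2020, 11, 1)))
```

Numbers like these only support a judgment; mailing list activity and references from other projects still need a human look.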
Open source projects are not that different
 Which programming languages / frameworks ?
 Almost any (Java, .NET, Swift, Python, Node.js, …)
 Open source still needs
 Documentation
 Tests
 Release notes
Open source licenses
Differences and similarities
 Not all licenses can be combined !
 What they have in common
 No guarantees, non-exclusive, (mostly) irrevocable
 Right to distribute
 Can be used commercially
 Important differences between licenses
 Patent usage protection or not
 Distribution of source code or not
 License(s) of software combinations
Types of licenses
 “Strong” copy-left licenses
 GPL, EUPL, Affero GPL, …
 “Weak” copy-left
 LGPL, MPL, EPL, …
 Permissive licenses
 BSD, Apache, MIT, …
 https://choosealicense.com/appendix/
 https://opensource.org/licenses
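A rough sketch of how the three license families could be used as a first filter when combining components. The mapping and the function names are illustrative assumptions. Real compatibility depends on the exact license versions and on how the components are combined, so this is only a hint that a closer review is needed, not legal advice.

```python
from enum import IntEnum


class Category(IntEnum):
    """License families, ordered from least to most restrictive."""
    PERMISSIVE = 1
    WEAK_COPYLEFT = 2
    STRONG_COPYLEFT = 3


# Illustrative mapping only, using license names mentioned on the slide
LICENSE_CATEGORY = {
    "MIT": Category.PERMISSIVE,
    "BSD": Category.PERMISSIVE,
    "Apache": Category.PERMISSIVE,
    "LGPL": Category.WEAK_COPYLEFT,
    "MPL": Category.WEAK_COPYLEFT,
    "GPL": Category.STRONG_COPYLEFT,
    "EUPL": Category.STRONG_COPYLEFT,
    "Affero GPL": Category.STRONG_COPYLEFT,
}


def strictest_category(licenses):
    """Return the most restrictive family among the given license names."""
    return max(LICENSE_CATEGORY[name] for name in licenses)


def needs_license_review(licenses):
    """Flag combinations that contain at least one copy-left component."""
    return strictest_category(licenses) > Category.PERMISSIVE


if __name__ == "__main__":
    print(strictest_category(["MIT", "LGPL"]).name)
    print(needs_license_review(["MIT", "BSD"]))
```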
Strong copy-left example: GNU/Linux
 GPL kernel + GNU “userland” applications
 Lots of additional kernel features developed
 Different filesystems, firewall, network stack, …
 Proprietary drivers possible, but distributed separately
Weak copy-left example: eID middleware
 Belgian eID card middleware
 https://github.com/fedict/eid-mw
 Lesser GPL (LGPL)
 Can be used by proprietary applications
Permissive example: macOS / iOS
 Mix of proprietary and open source parts
 https://developer.apple.com/opensource/
 Darwin “core OS”
 BSD licensed parts
 Apple Public Source License (APSL)
 Other parts and applications
 Some Apache licensed parts
Code sharing platforms and communities
Code sharing / versioning
 Focus on sharing code, packaging
 Often include issue tracking, build services
 Free and commercial plans
 GitHub, Codeberg, SourceForge, …
Foundations and communities
 Focus on process, consistency, license compatibility
 Often NPO with elected board members
 Use or provide own code sharing platform
 Apache Foundation, Eclipse Foundation, Drupal.org…
Communication is key
 Open source projects are international
 Users may not be native (English) speakers
 Different cultures: direct vs indirect communication
 Often driven by volunteers
 They are not (free) employees
Example: RDF4J project board on GitHub
Example: Eclipse Foundation
Open source business models
Dual-licensing
 Multiple licenses for the same code
 Only the copyright holder can do this
 E.g. GPL for use in other open source projects…
 … and commercial license for “closed” source
 Examples:
 MySQL Database: GPL and Oracle license
 iText PDF library: AGPL and proprietary license
Additional services or modules
 Training, extra components, integrations …
 Odoo, Alfresco, Pentaho, …
 Providing “stable” combinations, long-term support
 Red Hat Enterprise Linux, Oracle JDK, …
Cost savings / synergies
 Sharing cost of development and maintenance
 Companies contributing to tools they use themselves
 Android, Python, Chromium, Eclipse IDE …
 When software is not the core business
 Selling hardware or (more profitable) services
 Network drivers, Kubernetes …
For marketing, fun and/or for the greater good
 Software developed by/for universities, governments
 Paid with taxpayers’ money
 E.g. BSD, Decidim, Accumulo …
 Community projects
 For fun and glory
 E.g. GIMP, PostgreSQL, VLC, …
 Some “fun” projects became quite big and profitable
 E.g. Linux, PHP, Drupal, …
Miscellaneous links on open source
Links
 EC Open Source Software Strategy
 https://ec.europa.eu/info/departments/informatics/open-source-software-strategy_en
 “Blue hats” community (French gov community)
 https://www.modernisation.gouv.fr/le-hub-des-communautes/blue-hats
 Annual FOSDEM conference
 https://fosdem.org
Open Data
What is open data ?
 Free, easy to reuse, non-sensitive data
 No personal data (GDPR !)
 Download files or webservice / API
 Who publishes open data ?
 Governments (local / regional / federal)
 Companies (voluntary basis)
 Citizens / crowd-sourcing projects (voluntary basis)
Legal framework
 EU Public Sector Information / Open Data Directive
 https://ec.europa.eu/digital-single-market/en/european-legislation-reuse-public-sector-information
 Transposed into federal law and regional decrees
What kind of data ? Almost any !
 A lot of statistics
 Population, consumer price index, average income …
 Transport and mobility
 Train delays, parking spots, …
 Environment
 Air quality, biodiversity, …
 Geospatial data
 Satellite data, maps, …
Formats
 No single format for each and every type of data
 Depends on application / systems / conventions
 Preference for open, structured file formats
 CSV, GTFS, (Geo)JSON, KML, ODS, sqlite, XLSX, XML …
 (RDF / Linked Data)
 Known / documented APIs
 OGC GIS, OpenAPI/Swagger, SPARQL …
 Availability and meaning often more important
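As an illustration of moving between two of the open formats listed above, here is a minimal sketch that turns CSV rows with coordinates into a GeoJSON FeatureCollection, using only the Python standard library. The function name `csv_to_geojson`, the column names `lon` / `lat` and the sample row are assumptions made for this example; real datasets will use their own column names and may need a coordinate system conversion first.

```python
import csv
import io
import json


def csv_to_geojson(csv_text, lon_field="lon", lat_field="lat"):
    """Convert CSV text with longitude/latitude columns to a GeoJSON dict."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Remove the coordinate columns; the remaining columns become properties
        lon = float(row.pop(lon_field))
        lat = float(row.pop(lat_field))
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": row,
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    # Made-up sample data, for illustration only
    sample = "name,lon,lat\nBrussels,4.35,50.85\n"
    print(json.dumps(csv_to_geojson(sample), indent=2))
```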
Open data licenses
Quick tour of licenses
 Well-known licenses
 Various Creative Commons, Open Data Commons
 https://creativecommons.org/about/cclicenses/
 https://opendatacommons.org/licenses/
 Legal situation in Belgium
 Status: it’s complicated (federated state)
 Quite a few slightly different custom licenses …
 … many of them are very similar to CC-licenses
Differences and similarities
 Not all licenses can be combined !
 What they have in common
 No guarantees, non-exclusive, (mostly) irrevocable
 Right to distribute
 Important differences between licenses
 Can be used commercially or not
 Attribution required or not
 License of combined datasets
Open data business models
Additional services and data
 Combining and transforming data
 Providing Service Level Agreements
 Additional services
 E.g. user-friendly search, visualizations, reports…
 Examples
 Law: https://lex.be, https://jura.kluwer.be
 Companies: https://opencorporates.com
Example: MapTiler with OpenStreetMap data
Where to find open data ?
 Websites and (open data) portals
 https://data.gov.be
 If you can’t find it… just ask !
 opendata@belgium.be
 @BartHanssens
 #data.gov.be:matrix.org
Open data portals
Data.gov.be and the Open Data Task Force
 Open Data Task Force: FPS BOSA DG DT + DAV/ASA
 Portal is “by-product” of full (meta)data export
 See also https://github.com/fedict/dcat
 “No Wrong Door” principle
 Forward questions to Regions / Cities if needed
Combining open source and open data
Open Summer of Code
 Organized by OpenKnowledge Belgium
 + partners from government and private sector
 https://summerofcode.be/
 Students get paid to build open source projects
 Often projects which also use open data
 Mostly IT, but also business, design, marketing, …
 3 x 4 days in July
 With coaches, presentations on soft skills, …
Example: BeST Addresses
 Addresses + geo location from the 3 Regions
 Published and updated weekly by BOSA DG DT
 https://opendata.bosa.be/
 2019 oSoC project
 Python tools developed by students
 Data also available on https://openaddresses.io
 https://github.com/oSoc19/best
Open data communities
 OpenKnowledge Belgium
 Local “chapter” of OpenKnowledge
 https://be.okfn.org
 OSGeo Belgium
 Open data and open source
 https://www.osgeo.org/local-chapters/osgeo-belgium/
 Open Justice
 https://openjustice.be
Other community initiatives
Data-driven projects
 Mozilla CommonVoice
 https://commonvoice.mozilla.org/en/about
 Telraam traffic counters
 https://telraam.net/en
 Global low-power data transfer / IoT
 https://www.thethingsnetwork.org
Share your open data stories !
 Are you working on a thesis / paper on open data ?
 Did you create a stunning app / service ?
 Are you organizing an open data-driven event ?
 Let us know, so we can:
 Tweet and make some buzz
 Promote your work
 Convince more organizations to open data
Questions ?
BOSA.be
@BartHanssens
https://data.gov.be
opendata@belgium.be
#data.gov.be:matrix.org
Thank you !
