Skip to content

Refactor to allow containers to be directly used as spaces #9

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Aug 5, 2022
115 changes: 91 additions & 24 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,70 +5,137 @@
[![Code Style: Blue](https://img.shields.io/badge/code%20style-blue-4495d1.svg)](https://github.com/invenia/BlueStyle)
[![PkgEval](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/C/CommonRLSpaces.svg)](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/report.html)

## Introduction

A space is simply a set of objects. In a reinforcement learning context, spaces define the sets of possible states, actions, and observations.

In Julia, spaces can be represented by a variety of objects. For instance, a small discrete action set might be represented with `["up", "left", "down", "right"]`, or an interval of real numbers might be represented with an object from the [`IntervalSets`](https://github.com/JuliaMath/IntervalSets.jl) package. In general, the space defined by any Julia object is the set of objects `x` for which `x in space` returns `true`.

In addition to establishing the definition above, this package provides three useful tools:

1. Traits to communicate about the properties of spaces, e.g. whether they are continuous or discrete, how many subspaces they have, and how to interact with them.
2. Functions such as `product` for constructing more complex spaces
3. Constructors to for spaces whose elements are arrays, such as `ArraySpace` and `Box`.

## Concepts and Interface

### Interface for all spaces

Since a space is simply a set of objects, a wide variety of common Julia types including `Vector`, `Set`, `Tuple`, and `Dict`<sup>1</sup>can represent a space.
Because of this inclusive definition, there is a very minimal interface that all spaces are expected to implement. Specifically, it consists of
- `in(x, space)`, which tests whether `x` is a member of the set `space` (this can also be called with the `x in space` syntax).
- `rand(space)`, which returns a valid member of the set<sup>2</sup>.
- `eltype(space)`, which returns the type of the elements in the space.

In addition, the `SpaceStyle` trait is always defined. Calling `SpaceStyle(space)` will return either a `FiniteSpaceStyle`, `ContinuousSpaceStyle`, `HybridSpaceStyle`, or an `UnknownSpaceStyle` object.

### Finite discrete spaces

Spaces with a finite number of elements have `FiniteSpaceStyle`. These spaces are guaranteed to be iterable, implementing Julia's [iteration interface](https://docs.julialang.org/en/v1/manual/interfaces/). In particular `collect(space)` will return all elements in an array.

### Continuous spaces

Continuous spaces represent sets that have an uncountable number of elements they have a `SpaceStyle` of type `ContinuousSpaceStyle`. CommonRLSpaces does not adopt a rigorous mathematical definition of a continuous set, but, roughly, elements in the interior of a continuous space have other elements very close to them.

Continuous spaces have some additional interface functions:

- `bounds(space)` returns upper and lower bounds in a tuple. For example, if `space` is a unit circle, `bounds(space)` will return `([-1.0, -1.0], [1.0, 1.0])`. This allows agents to choose policies that appropriately cover the space e.g. a normal distribution with a mean of `mean(bounds(space))` and a standard deviation of half the distance between the bounds.
- `clamp(x, space)` returns an element of `space` that is near `x`. i.e. if `space` is a unit circle, `clamp([2.0, 0.0], space)` might return `[1.0, 0.0]`. This allows for a convenient way for an agent to find a valid action if they sample actions from a distribution that doesn't match the space exactly (e.g. a normal distribution).
- `clamp!(x, space)`, similar to `clamp`, but clamps `x` in place.

### Hybrid spaces

The interface for hybrid continuous-discrete spaces is currently planned, but not yet defined. If the space style is not `FiniteSpaceStyle` or `ContinuousSpaceStyle`, it is `UnknownSpaceStyle`.

### Spaces of arrays

[need to figure this out, but I think `elsize(space)` should return the size of the arrays in the space]

### Cartesian products of spaces

The Cartesian product of two spaces `a` and `b` can be constructed with `c = product(a, b)`.

The exact form of the resulting space is unspecified and should be considered an implementation detail. The only guarantees are (1) that there will be one unique element of `c` for every combination of one object from `a` and one object from `b` and (2) that the resulting space conforms to the interface above.

The `TupleSpaceProduct` constructor provides a specialized Cartesian product where each element is a tuple, i.e. `TupleSpaceProduct(a, b)` has elements of type `Tuple{eltype(a), eltype(b)}`.

---

<sup>1</sup>Note: the elements of a space represented by a `Dict` are key-value `Pair`s.
<sup>2</sup>[TODO: should we make any guarantees about whether `rand(space)` is drawn from a uniform distribution?]

## Usage

### Construction

|Category|Style|Example|
|:---|:----|:-----|
|Enumerable discrete space| `DiscreteSpaceStyle{()}()` | `Space((:cat, :dog))`, `Space(0:1)`, `Space(1:2)`, `Space(Bool)`|
|Multi-dimensional discrete space| `DiscreteSpaceStyle{(3,4)}()` | `Space((:cat, :dog), 3, 4)`, `Space(0:1, 3, 4)`, `Space(1:2, 3, 4)`, `Space(Bool, 3, 4)`|
|Multi-dimensional variable discrete space| `DiscreteSpaceStyle{(2,)}()` | `Space(SVector((:cat, :dog), (:litchi, :longan, :mango))`, `Space([-1:1, (false, true)])`|
|Continuous space| `ContinuousSpaceStyle{()}()` | `Space(-1.2..3.3)`, `Space(Float32)`|
|Multi-dimensional continuous space| `ContinuousSpaceStyle{(3,4)}()` | `Space(-1.2..3.3, 3, 4)`, `Space(Float32, 3, 4)`|
|Enumerable discrete space| `FiniteSpaceStyle{()}()` | `(:cat, :dog)`, `0:1`, `["a","b","c"]` |
|One dimensional continuous space| `ContinuousSpaceStyle{()}()` | `-1.2..3.3`, `Interval(1.0, 2.0)` |
|Multi-dimensional discrete space| `FiniteSpaceStyle{(3,4)}()` | `ArraySpace((:cat, :dog), 3, 4)`, `ArraySpace(0:1, 3, 4)`, `ArraySpace(1:2, 3, 4)`, `ArraySpace(Bool, 3, 4)`|
|Multi-dimensional variable discrete space| `FiniteSpaceStyle{(2,)}()` | `product((:cat, :dog), (:litchi, :longan, :mango))`, `product(-1:1, (false, true))`|
|Multi-dimensional continuous space| `ContinuousSpaceStyle{(2,)}()` or `ContinuousSpaceStyle{(3,4)}()` | `Box([-1.0, -2.0], [2.0, 4.0])`, `product(-1.2..3.3, -4.6..5.0)`, `ArraySpace(-1.2..3.3, 3, 4)`, `ArraySpace(Float32, 3, 4)` |
|Multi-dimensional hybrid space| `HybridSpaceStyle{(2,),()}()` | `product(-1.2..3.3, -4.6..5.0, [:cat, :dog])`, `product(Box([-1.0, -2.0], [2.0, 4.0]), [1,2,3])`|

### API

```julia
julia> using CommonRLSpaces

julia> s = Space((:litchi, :longan, :mango))
Space{Tuple{Symbol, Symbol, Symbol}}((:litchi, :longan, :mango))
julia> s = (:litchi, :longan, :mango)

julia> rand(s)
:litchi

julia> rand(s) in s
true

julia> size(s)
()
julia> length(s)
3
```

```julia
julia> s = Space(UInt8, 2,3)
Space{Matrix{UnitRange{UInt8}}}(UnitRange{UInt8}[0x00:0xff 0x00:0xff 0x00:0xff; 0x00:0xff 0x00:0xff 0x00:0xff])
julia> s = ArraySpace(1:5, 2,3)
CommonRLSpaces.RepeatedSpace{UnitRange{Int64}, Tuple{Int64, Int64}}(1:5, (2, 3))

julia> rand(s)
2×3 Matrix{UInt8}:
0x7b 0x38 0xf3
0x6a 0xe1 0x28
2×3 Matrix{Int64}:
4 1 1
3 2 2

julia> rand(s) in s
true

julia> SpaceStyle(s)
DiscreteSpaceStyle{(2, 3)}()
FiniteSpaceStyle()

julia> size(s)
julia> elsize(s)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we may also have eltype implemented

(2, 3)
```

```julia
julia> s = Space(SVector(-1..1, 0..1))
Space{SVector{2, ClosedInterval{Int64}}}(ClosedInterval{Int64}[-1..1, 0..1])
julia> s = product(-1..1, 0..1)
Box{StaticArraysCore.SVector{2, Float64}}([-1.0, 0.0], [1.0, 1.0])

julia> rand(s)
2-element SVector{2, Float64} with indices SOneTo(2):
0.5563101538643473
0.9227368869418011
2-element StaticArraysCore.SVector{2, Float64} with indices SOneTo(2):
0.03049072910834738
0.6295234114874269

julia> rand(s) in s
true

julia> SpaceStyle(s)
ContinuousSpaceStyle{(2,)}()
ContinuousSpaceStyle()

julia> size(s)
julia> elsize(s)
(2,)
```

julia> bounds(s)
([-1.0, 0.0], [1.0, 1.0])

julia> clamp([5, 5], s)
2-element StaticArraysCore.SizedVector{2, Float64, Vector{Float64}} with indices SOneTo(2):
1.0
1.0
```
29 changes: 26 additions & 3 deletions src/CommonRLSpaces.jl
Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,34 @@ module CommonRLSpaces

using Reexport

@reexport using FillArrays
@reexport using IntervalSets
@reexport using StaticArrays
@reexport import Base: OneTo

using StaticArrays
using FillArrays
using Random
import Base: clamp

export
SpaceStyle,
AbstractSpaceStyle,
FiniteSpaceStyle,
ContinuousSpaceStyle,
UnknownSpaceStyle,
bounds,
elsize

include("basic.jl")

export
Box,
ArraySpace

include("array.jl")

export
product,
TupleProduct

include("product.jl")

end
81 changes: 81 additions & 0 deletions src/array.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,81 @@
abstract type AbstractArraySpace end
# Maybe AbstractArraySpace should have an eltype parameter so that you could call convert(AbstractArraySpace{Float32}, space)

"""
Box(lower, upper)

A Box represents a space of real-valued arrays bounded element-wise above by `upper` and below by `lower`, e.g. `Box([-1, -2], [3, 4]` represents the two-dimensional vector space that is the Cartesian product of the two closed sets: ``[-1, 3] \\times [-2, 4]``.

The elements of a Box are always `AbstractArray`s with `AbstractFloat` elements. `Box`es always have `ContinuousSpaceStyle`, and products of `Box`es with `Box`es or `ClosedInterval`s are `Box`es when the dimensions are compatible.
"""
struct Box{A<:AbstractArray{<:AbstractFloat}} <: AbstractArraySpace
lower::A
upper::A

Box{A}(lower, upper) where {A<:AbstractArray} = new(lower, upper)
end

function Box(lower, upper; convert_to_static::Bool=false)
@assert size(lower) == size(upper)
sz = size(lower)
continuous_lower = convert(AbstractArray{float(eltype(lower))}, lower)
continuous_upper = convert(AbstractArray{float(eltype(upper))}, upper)
if convert_to_static
final_lower = SArray{Tuple{sz...}}(continuous_lower)
final_upper = SArray{Tuple{sz...}}(continuous_upper)
else
final_lower, final_upper = promote(continuous_lower, continuous_upper)
end
return Box{typeof(final_lower)}(final_lower, final_upper)
end

# By default, convert builtin arrays to static
Box(lower::Array, upper::Array) = Box(lower, upper; convert_to_static=true)

SpaceStyle(::Box) = ContinuousSpaceStyle()

function Base.rand(rng::AbstractRNG, sp::Random.SamplerTrivial{<:Box})
box = sp[]
return box.lower + rand_similar(rng, box.lower) .* (box.upper-box.lower)
end

rand_similar(rng::AbstractRNG, a::StaticArray) = rand(rng, typeof(a))
rand_similar(rng::AbstractRNG, a::AbstractArray) = rand(rng, eltype(a), size(a)...)

Base.in(x::AbstractArray, b::Box) = all(b.lower .<= x .<= b.upper)

Base.eltype(::Box{A}) where A = A
elsize(b::Box) = size(b.lower)

bounds(b::Box) = (b.lower, b.upper)
Base.clamp(x::AbstractArray, b::Box) = clamp.(x, b.lower, b.upper)

Base.convert(t::Type{<:Box}, i::ClosedInterval) = t(SA[minimum(i)], SA[maximum(i)])

struct RepeatedSpace{B, S<:Tuple} <: AbstractArraySpace
base_space::B
elsize::S
end

"""
ArraySpace(base_space, size...)

Create a space of Arrays with shape `size`, where each element of the array is drawn from `base_space`.
"""
ArraySpace(base_space, size...) = RepeatedSpace(base_space, size)

SpaceStyle(s::RepeatedSpace) = SpaceStyle(s.base_space)

Base.rand(rng::AbstractRNG, sp::Random.SamplerTrivial{<:RepeatedSpace}) = rand(rng, sp[].base_space, sp[].elsize...)

Base.in(x::AbstractArray, s::RepeatedSpace) = all(entry in s.base_space for entry in x)
Base.eltype(s::RepeatedSpace) = AbstractArray{eltype(s.base_space), length(s.elsize)}
Base.eltype(s::RepeatedSpace{<:AbstractInterval}) = AbstractArray{Random.gentype(s.base_space), length(s.elsize)}
elsize(s::RepeatedSpace) = s.elsize

function bounds(s::RepeatedSpace)
bs = bounds(s.base_space)
return (Fill(first(bs), s.elsize...), Fill(last(bs), s.elsize...))
end

Base.clamp(x::AbstractArray, s::RepeatedSpace) = map(entry -> clamp(entry, s.base_space), x)
Loading