Azure Gremlin edge traversal suspiciously high (Out() step) RU cost

Author: oPolo

I have a strange issue where an out() step across a few edges roughly triples my RU cost. I hope someone can shed light on why this happens and on what I can do to mitigate it.

I have a graph in Cosmos DB with two vertex labels: "Profile" and "Score". Each profile has 0 or 1 score vertices, connected via a "ProfileHasAggregatedScore" edge. The partitionKey is the ID of the Profile.

The RU costs of the following queries are currently:

g.V().hasLabel('Profile').out('ProfileHasAggregatedScore')
>78 RU (8 scores found)

And for reference, the cost of getting all vertices of a type is:

g.V().hasLabel('Profile')
>28 RU (110 profiles found)

g.E().hasLabel('ProfileHasAggregatedScore')
>11 RU (8 edges found)

g.V().hasLabel('AggregatedRating')
>11 RU (8 scores found)

And the cost of fetching a single vertex or edge is:

g.V('aProfileId').hasLabel('Profile')
>4 RU (1 found)

g.E('anEdgeId')
>7 RU

g.V('aRatingId')
>3.5 RU

Can someone please help me understand why a traversal that touches only a few vertices along the way (see the execution profile at the bottom) is more expensive than fetching everything? And is there something I can do to prevent it? Adding a has() filter on the partitionKey does not seem to help. It seems odd that traversing 16 more elements (8 edges and 8 vertices) after finding 110 vertices nearly triples the cost of the operation.
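
A partition-scoped version of the traversal can be sketched as follows. This is an illustration only, not the exact query that was run: the property name 'partitionKey' is an assumption about how the container is configured, and 'aProfileId' is a placeholder id. Each traversal would be submitted as a separate request, and the `//` lines are annotations rather than part of the query.

```gremlin
// Assumption: the container's partition key property is named 'partitionKey'.
// 'aProfileId' is a placeholder for a real Profile id.

// Variant 1: has() filter on the partition key property
g.V().has('partitionKey', 'aProfileId').hasLabel('Profile').out('ProfileHasAggregatedScore')

// Variant 2: address the start vertex by a [partition key value, vertex id] pair.
// Both values are the same here because the partitionKey is the Profile's ID.
g.V(['aProfileId', 'aProfileId']).out('ProfileHasAggregatedScore')
```

Appending executionProfile() to either variant, as in the profile at the bottom, shows how many store operations each step issues, which makes it easier to compare against the unfiltered traversal.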

(NB: With 1000 profiles, the cost of a single traversal along an edge to the score vertex is 2200 RU. This seems high, considering the emphasis the Azure team puts on it being scalable.)

Execution profile of the traversal, in case it helps (most of the time seems to be spent finding the edges in the out() step):

[
  {
    "gremlin": "g.V().hasLabel('Profile').out('ProfileHasAggregatedScore').executionProfile()",
    "totalTime": 46,
    "metrics": [
      {
        "name": "GetVertices",
        "time": 13,
        "annotations": {
          "percentTime": 28.26
        },
        "counts": {
          "resultCount": 110
        },
        "storeOps": [
          {
            "fanoutFactor": 1,
            "count": 110,
            "size": 124649,
            "time": 2.47
          }
        ]
      },
      {
        "name": "GetEdges",
        "time": 26,
        "annotations": {
          "percentTime": 56.52
        },
        "counts": {
          "resultCount": 8
        },
        "storeOps": [
          {
            "fanoutFactor": 1,
            "count": 8,
            "size": 5200,
            "time": 6.22
          },
          {
            "fanoutFactor": 1,
            "count": 0,
            "size": 49,
            "time": 0.88
          }
        ]
      },
      {
        "name": "GetNeighborVertices",
        "time": 7,
        "annotations": {
          "percentTime": 15.22
        },
        "counts": {
          "resultCount": 8
        },
        "storeOps": [
          {
            "fanoutFactor": 1,
            "count": 8,
            "size": 6303,
            "time": 1.18
          }
        ]
      },
      {
        "name": "ProjectOperator",
        "time": 0,
        "annotations": {
          "percentTime": 0
        },
        "counts": {
          "resultCount": 8
        }
      }
    ]
  }
]

Originally Sourced from: https://stackoverflow.com/questions/57459690/azure-gremlin-edge-traversal-suspiciously-high-out-step-ru-cost