How to execute multiple aggregate Expressions in the same query?
  • Hi Karem,

    I'm trying  to calculate the sum of two different columns for a set of  documents in one query, is there anyway to do it in Apstrata?! I also want  to save the sum result in the response as a key-value pair to be able to differentiate between the sum results ...
    I tried these params:
    array_push($params, new KeyValue("apsdb.aggregateGroupBy", 'campaignName<string>'));
    array_push($params, new KeyValue("apsdb.aggregateGlobal", 'true'));
    array_push($params, new KeyValue("apsdb.aggregatePage", 'false'));
    array_push($params, new KeyValue("apsdb.aggregateExpression", 'SUM($impressions),SUM($totalClicks)' ));

    The result was which is very wired behavior, where the value of the SUM($totalClicks) was assigned to the campaign name!!!!!!!!
    where the  SUM($impressions) is the one before the fields array ...  

    [aggregate] => Array
    (
    [globalScope] => Array
    (
    [groupby] => Array
    (
    [0] => Array
    (
    [value] => 147932.0
    [fields] => Array
    (
    [0] => Array
    (
    [name] => campaignName
    [value] => 74871.0
    )

    )

    )

    [1] => Array
    (
    [value] => 31261.0
    [fields] => Array
    (
    [0] => Array
    (
    [name] => campaignName
    [value] => 16730.0
    )

    )

    )

    [2] => Array
    (
    [value] => 15906.0
    [fields] => Array
    (
    [0] => Array
    (
    [name] => campaignName
    [value] => 8001.0
    )

    )

    )

    )

    )

    )

    Could you please advice!!

    Thanks

  • Hello,

    We do not currently support multiple aggregate expressions in one query. This should be supported soon though.

    For the moment, I suggest you create a server side script to run two queries, as follows. This should resolve your problem for now (kindly replace field names with yours).

    var queryParams = {
    "apsdb.store": "DefaultStore",
    "apsdb.query": "clicks<numeric> is not null and impressions<numeric> is not null",
    "apsdb.queryFields": "campaign, clicks, impressions", 
    "apsdb.aggregateExpression": "SUM($clicks)",
    "apsdb.aggregatePage":"false",
    "apsdb.aggregateGlobal":"true",
    "apsdb.aggregateGroupBy": "campaign<string>"
    }

    var query1Results = apsdb.callApi("Query", queryParams, null);
    var sumOfClicksGroupedByCampaign = query1Results.result.aggregate.globalScope.groupby;

    queryParams["apsdb.aggregateExpression"] = "SUM($impressions)";
    var query2Results = apsdb.callApi("Query", queryParams, null);
    var sumOfImpressionsGroupedByCampaign = query2Results.result.aggregate.globalScope.groupby;

    var parsedQuery1 = parseAggregate("clicks", sumOfClicksGroupedByCampaign);
    var parsedQuery2 = parseAggregate("impressions", sumOfImpressionsGroupedByCampaign);
    return mergeAggregates(parsedQuery1.concat(parsedQuery2));

    /*
     * Generic function that parses the result of af aggregate expression + groupby
     * and simplifies it into the following structure
     * [
     *  ...
     *  {"grouped-by-field-value-i": {"field-name":"sum-value"}}, 
     *  ...
     * ]
     * Example:
     * [{"campaign1": {"clicks":"42"}}, {"campaign2": {"clicks":"500"}}]
     */
    function parseAggregate(fieldName, results) {
     
    var count = [];
    for (var i = 0; i < results.length; i++) {
    var fields =  Object.keys(results[i])[1];
    var name = results[i][fields][0]["value"];
    var obj = {};
    var value = {};
    value[fieldName] = results[i].value;
    obj[name] = value;
    count.push(obj);
    }
    return count;
    }

    /*
     * Merges an array of aggregates built from the results of "parseAggregate" by group-by field
     * returns the following format 
     * [
    {
    "groupby-field-n": {"summedfield1":sum1, ..., "summedfield2":sum2}, 
    }
     * ]
     * Example:
     * [
     * {
     *    "campaign1": { "clicks": "42.0", "impressions": "30.0"}
     *    
     * }, 
     * {
     *     "campaign2": {"clicks": "15.0", "impressions": "5.0"}
     * }
     * ] 
     */
    function mergeAggregates(aggregates) {
    var mergedNames = [];
    var merged = [];
    for (var i = 0; i < aggregates.length; i++) {
    var key = Object.keys(aggregates[i])[0];
    var index = mergedNames.indexOf(key);
    var obj = aggregates[i][key];
    var x = {};
    if (index == -1) {
    mergedNames.push(key);
    index = merged.length;
    x[key] = obj;
    }else {
    x = merged[index];
    for (var attrname in obj) { x[key][attrname] = obj[attrname];}
    }
    merged[index] = x;
    }
    return merged;
    }
  • Hi Karim,

    Thanks a lot for your respond, in fact I found i way to do aggregates together with grouping but, the column names of the group  by fields becomes null!!! Te way I found can be used to get all the aggregates  in one call, and another call needed to get the  group  by fields  names ..., here is the way to do it :

    array_push($params, new KeyValue("apsdb.aggregateGroupBy", 'aaaa1Impressions<numeric>, aaaa2totalClicks<numeric>, aaaa3CTR<numeric>, campaignName<string>, flightName<numeric>'));
                    
     array_push($params, new KeyValue("apsdb.aggregateExpression",  'SUM($totalClicks), SUM($impressions), SUM($impressions)/SUM($totalClicks)*0.01' ));

    I found that the group by fields names have to be ordered alphabetical ;). I know that I can not rely on this since any change in the Apstrata back-end might change this behavior.

    Thanks 





  • Indeed, 

    I did not suggest this solution as it is a side effect of the current support for aggregate expressions :)
    As long as you are OK with it that's fine with me ;)

    Karim
  • Hi Karim,

    It's
    been very long time ago. Hope you are doing well and having good time. 
    I wonder if there is any updated in Apstarta backend.  Is multiple aggregates feature is supported by Apstrata now, or not yet ?!
    In this thread you shared a script to use since multiple aggregates was not supported at that time.  At the sam time I found a way to do the multiple aggregates, I explained it in the thread too.
    My question is: Is there a big difference in terms of performance between the way you suggested and the way I used ?
    In
    other words, let us say I have 100 aggregates over a  1000 document,
    There will be a big difference in terms of performance between the two
    ways?
    One more question: When I'm sending more than 32 params to aggregate I'm getting this error:
    metadata] => Array
                                    (
                                        [status] => failure
                                        [statusCode] => 500
                                        [errorCode] => INTERNAL_ERROR
                                        [errorDetail] => Error executing view query
                                    )


    Could you please advice!

    Thank you in advance
    Mo
  • Hello,

    I'm sorry for the late reply.

    Answering your questions: 
    • We did not add the support of multiple aggregates to Apstrata yet, 
    • I guess your option would be faster than running a script at some point (if you're dealing with large volumes of data). Note that, as mentioned in my answer to your post, your solution is not really "standard" so it my be affected by changes, as you also noted it yourself,
    • There is a limit to the number of parameters you can aggregate, which explains the error. We should work on a better error message.
    Best,
    Karim

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!