String Aggregation Techniques(字符串汇总技术)
On occasion it is necessary to aggregate data from a number of rows into a
single row, giving a list of data associated with a specific value. Using the
SCOTT.EMP table as an example, we might want to retrieve a list of employees for
each department. Below is a list of the base data and the type of output we
would like to return from an aggregate query.
Base Data:
DEPTNO ENAME
---------- ----------
20 SMITH
30 ALLEN
30 WARD
20 JONES
30 MARTIN
30 BLAKE
10 CLARK
20 SCOTT
10 KING
30 TURNER
20 ADAMS
30 JAMES
20 FORD
10 MILLER
Desired Output:
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,FORD,ADAMS,SCOTT,JONES
30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARDThis article
is based on a thread from
and contains several methods to achieve the desired results.
WM_CONCAT Built-in FunctionIf you are running a version of the database
where the WM_CONCAT function is present, then it is a zero effort solution as it
performs the aggregation for you.
COLUMN employees FORMAT A50
SELECT deptno, wm_concat(ename) AS employees
FROM emp
GROUP BY deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,FORD,ADAMS,SCOTT,JONES
30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD
3 rows selected.
Specific FunctionOne approach is to write a specific function to solve
the problems. The get_employees function listed below returns a list of
employees for the specified department.
CREATE OR REPLACE FUNCTION get_employees (p_deptno in emp.deptno%TYPE)
RETURN VARCHAR2
IS
l_text VARCHAR2(32767) := NULL;
BEGIN
FOR cur_rec IN (SELECT ename FROM emp WHERE deptno = p_deptno) LOOP
l_text := l_text || ',' || cur_rec.ename;
END LOOP;
RETURN LTRIM(l_text, ',');
END;
/
SHOW ERRORSThe function can then be incorporated into a query
as follows.
COLUMN employees FORMAT A50
SELECT deptno,
get_employees(deptno) AS employees
FROM emp
GROUP by deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,JONES,SCOTT,ADAMS,FORD
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
3 rows selected.To reduce the number of calls to the
function, and thereby improve performance, we might want to filter the rows in
advance.
COLUMN employees FORMAT A50
SELECT e.deptno,
get_employees(e.deptno) AS employees
FROM (SELECT DISTINCT deptno
FROM emp) e;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,JONES,SCOTT,ADAMS,FORD
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
3 rows selected.
Generic Function using Ref CursorAn alternative approach is to write a
function to concatenate values passed using a ref cursor. This is essentially
the same as the previous example, except that the cursor is passed in making it
generic, as shown below.
CREATE OR REPLACE FUNCTION concatenate_list (p_cursor IN SYS_REFCURSOR)
RETURN VARCHAR2
IS
l_return VARCHAR2(32767);
l_temp VARCHAR2(32767);
BEGIN
LOOP
FETCH p_cursor
INTO l_temp;
EXIT WHEN p_cursor%NOTFOUND;
l_return := l_return || ',' || l_temp;
END LOOP;
RETURN LTRIM(l_return, ',');
END;
/
SHOW ERRORSThe CURSOR function is used to allow a query to be
passed to the function as a ref cursor, as shown below.
COLUMN employees FORMAT A50
SELECT e1.deptno,
concatenate_list(CURSOR(SELECT e2.ename FROM emp e2 WHERE e2.deptno = e1.deptno)) employees
FROM emp e1
GROUP BY e1.deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,JONES,SCOTT,ADAMS,FORD
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
3 rows selected.Once again, the total number of function
calls can be reduced by filtering the distinct values, rather than calling the
function for each row.
COLUMN employees FORMAT A50
SELECT deptno,
concatenate_list(CURSOR(SELECT e2.ename FROM emp e2 WHERE e2.deptno = e1.deptno)) employees
FROM (SELECT DISTINCT deptno
FROM emp) e1;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,JONES,SCOTT,ADAMS,FORD
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
3 rows selected.
User-Defined Aggregate FunctionPossibly the best generic solution is to
create a user-defined aggregate function, using the ODCIAggregate interface, to
solve the problem, as shown below.
CREATE OR REPLACE TYPE t_string_agg AS OBJECT
(
g_string VARCHAR2(32767),
STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT t_string_agg)
RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateIterate(self IN OUT t_string_agg,
value IN VARCHAR2 )
RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateTerminate(self IN t_string_agg,
returnValue OUT VARCHAR2,
flags IN NUMBER)
RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateMerge(self IN OUT t_string_agg,
ctx2 IN t_string_agg)
RETURN NUMBER
);
/
SHOW ERRORS
CREATE OR REPLACE TYPE BODY t_string_agg IS
STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT t_string_agg)
RETURN NUMBER IS
BEGIN
sctx := t_string_agg(NULL);
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateIterate(self IN OUT t_string_agg,
value IN VARCHAR2 )
RETURN NUMBER IS
BEGIN
SELF.g_string := self.g_string || ',' || value;
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateTerminate(self IN t_string_agg,
returnValue OUT VARCHAR2,
flags IN NUMBER)
RETURN NUMBER IS
BEGIN
returnValue := RTRIM(LTRIM(SELF.g_string, ','), ',');
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateMerge(self IN OUT t_string_agg,
ctx2 IN t_string_agg)
RETURN NUMBER IS
BEGIN
SELF.g_string := SELF.g_string || ',' || ctx2.g_string;
RETURN ODCIConst.Success;
END;
END;
/
SHOW ERRORS
CREATE OR REPLACE FUNCTION string_agg (p_input VARCHAR2)
RETURN VARCHAR2
PARALLEL_ENABLE AGGREGATE USING t_string_agg;
/
SHOW ERRORSThe aggregate function is implemented using a type
and type body, and is used within a query.
COLUMN employees FORMAT A50
SELECT deptno, string_agg(ename) AS employees
FROM emp
GROUP BY deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,FORD,ADAMS,SCOTT,JONES
30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD
3 rows selected.
ROW_NUMBER() and SYS_CONNECT_BY_PATH functions in Oracle 9iAn example
on williamrobertson.net
uses the ROW_NUMBER() and SYS_CONNECT_BY_PATH
functions to achieve the same result without the use of PL/SQL or additional
type definitions.
SELECT deptno,
LTRIM(MAX(SYS_CONNECT_BY_PATH(ename,','))
KEEP (DENSE_RANK LAST ORDER BY curr),',') AS employees
FROM (SELECT deptno,
ename,
ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) AS curr,
ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) -1 AS prev
FROM emp)
GROUP BY deptno
CONNECT BY prev = PRIOR curr AND deptno = PRIOR deptno
START WITH curr = 1;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
3 rows selected.
COLLECT function in Oracle 10gAn example on oracle-developer.net
uses the COLLECT function in Oracle 10g to get the same result.
This method requires a table type and a function to convert the contents of the
table type to a string. I've altered his method slightly to bring it in line
with this article.
CREATE OR REPLACE TYPE t_varchar2_tab AS TABLE OF VARCHAR2(4000);
/
CREATE OR REPLACE FUNCTION tab_to_string (p_varchar2_tab IN t_varchar2_tab,
p_delimiter IN VARCHAR2 DEFAULT ',') RETURN VARCHAR2 IS
l_string VARCHAR2(32767);
BEGIN
FOR i IN p_varchar2_tab.FIRST .. p_varchar2_tab.LAST LOOP
IF i != p_varchar2_tab.FIRST THEN
l_string := l_string || p_delimiter;
END IF;
l_string := l_string || p_varchar2_tab(i);
END LOOP;
RETURN l_string;
END tab_to_string;
/The query below shows the COLLECT function in
action.
COLUMN employees FORMAT A50
SELECT deptno,
tab_to_string(CAST(COLLECT(ename) AS t_varchar2_tab)) AS employees
FROM emp
GROUP BY deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,JONES,SCOTT,ADAMS,FORD
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
3 rows selected.For more information see: