A Complementary Column Generation Approach for the Graph Equipartition Problem

This paper investigates the problem of partitioning a complete weighted graph into complete subgraphs, each having the same number of vertices, with the objective of minimizing the sum of edge weights of the resulting subgraphs. This NP-complete problem arises in many applications such as assignment and scheduling-related group partitioning problems and micro-aggregation techniques. In this paper, we present a mathematical programming model and propose a complementary column generation approach to solve the resulting model. A dual-based lower bounding feature is also introduced to curtail the notorious tailing-off effects often induced when using column generation methods. Computational results are presented for a wide range of test problems.


Introduction and Motivation
In this paper, we study the problem of partitioning a complete weighted graph into complete subgraphs, each having the same number of vertices, with the objective of minimizing the total edge weights of the resulting subgraphs. This problem, denoted by GPP, is formally stated in Section 1.1 below, and Section 1.2 then presents some motivating examples.

Statement of Problem GPP
Consider a complete weighted graph $G(V, E)$, where $V$ and $E$, respectively, denote the set of vertices and edges of the graph $G$. Let $v = 1, 2, \ldots, |V|$ index the vertices of $V$, and for $v_1, v_2 \in V$ with $v_1 \neq v_2$, let $(v_1, v_2) \in E$ denote the edge joining $v_1$ and $v_2$ in $G$. Let $w(v_1, v_2) > 0$ denote the weight associated with the edge $(v_1, v_2)$. Let $n$ be a positive integer and suppose that $\alpha = |V|/n$ is integer-valued. Hence, the set $V$ can be partitioned into $n$ subsets, each of which is composed of $\alpha$ vertices. Let $P$ denote the set of all distinct subsets of $V$, each of which has $\alpha$ vertices, and let $V_p$ denote the $p$th such vertex subset, $\forall p = 1, 2, \ldots, |P|$. An $n$-partition of $V$ is a collection of $n$ vertex subsets from $P$, say $V_{p_1}, V_{p_2}, \ldots, V_{p_n}$, satisfying $\bigcup_{i=1}^{n} V_{p_i} = V$ and $V_{p_i} \cap V_{p_j} = \emptyset$, $\forall i, j \in \{1, \ldots, n\}$ with $i \neq j$. Let $Q$ denote the set of all such $n$-partitions, indexed by $q = 1, \ldots, |Q|$, where the $q$th $n$-partition is given by $\{V_{p_k(q)}, k = 1, \ldots, n\}$. For any $V_p$, $p \in P$, let $w_p = \sum_{v_i, v_j \in V_p,\ i<j} w(v_i, v_j)$, and accordingly, let $c_q \equiv \sum_{k=1}^{n} w_{p_k(q)}$ represent the cost of the $q$th $n$-partition for any $q \in Q$. Problem GPP then seeks an $n$-partition $q^* \in Q$ such that $c_{q^*} \le c_q$, $\forall q \in Q$.
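To make the definitions of $w_p$, $c_q$, and $q^*$ concrete, the following is a minimal Python sketch that evaluates them by exhaustive enumeration on a tiny instance. It is an illustration only, not the solution method of this paper: the function names and the dictionary-of-`frozenset` weight representation are assumptions made for the example, and the enumeration is practical only for very small $|V|$.

```python
from itertools import combinations


def subset_weight(weights, subset):
    """w_p: sum of edge weights over all vertex pairs inside one subset."""
    return sum(weights[frozenset(pair)] for pair in combinations(sorted(subset), 2))


def partition_cost(weights, partition):
    """c_q: total weight of an n-partition given as a list of vertex subsets."""
    return sum(subset_weight(weights, block) for block in partition)


def best_equipartition(vertices, weights, n):
    """Enumerate every n-partition with blocks of size |V|/n; return (cost, partition)."""
    vertices = sorted(vertices)
    alpha, remainder = divmod(len(vertices), n)
    if remainder:
        raise ValueError("|V| must be divisible by n")

    def extend(remaining):
        if not remaining:
            yield []
            return
        first, rest = remaining[0], remaining[1:]
        # Always place the smallest remaining vertex in the next block,
        # so each partition is generated exactly once.
        for mates in combinations(rest, alpha - 1):
            block = (first,) + mates
            left = [v for v in rest if v not in mates]
            for tail in extend(left):
                yield [block] + tail

    return min(((partition_cost(weights, q), q) for q in extend(vertices)),
               key=lambda pair: pair[0])
```

For instance, with four vertices, $n = 2$, and weights 1 and 2 on edges (1,2) and (3,4), the call returns the partition {1,2}, {3,4} with cost 3 whenever the remaining edges are heavier.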

Motivating Examples
We provide two motivating examples for Problem GPP, where we have used specific numbers in lieu of generic notation for the purpose of illustration.
Example 1. Consider a firm that operates four work centres and needs to assign three employees to each centre (from a total of 12 available employees). For i = 1, . . . , 12, let e i denote the ith employee. Each employee ranks the other 11 employees using the ranks {1, . . . , 11}, where a lower rank indicates a higher preference for working together. We construct a complete weighted graph having 12 vertices, where vertex v i corresponds to employee e i , ∀ i = 1, . . . , 12, and where the weight associated with the edge joining vertices v i and v j , i, j ∈ {1, . . . , 12}, i ≠ j , represents the sum of the preferences of employees e i and e j to work with each other. The problem of interest, then, is to partition the underlying graph into four complete subgraphs, each having three vertices, so that the total weight of the resulting complete subgraphs is minimal, thereby achieving a best aggregate preference.
Example 2. Consider a firm having 15 business branches that seeks to assign one of the available supervisors to each cluster of five branches. Since a supervisor assigned to any given cluster needs to frequently travel between the branches within the clusters, it is desired that the sum of the distances between the branches of a given cluster should be small. This problem can likewise be modelled as a complete graph partitioning problem having 15 vertices, each of which represents a branch, and with n = 3, where the edge weight associated with any pair of vertices is given by the distance between the corresponding branches.
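The edge-weight construction of Example 1 can be sketched in a few lines of Python. The nested-dictionary input `rank` and the function name are hypothetical choices made for illustration; the only rule taken from the example is that the weight of an edge is the sum of the two employees' mutual ranks.

```python
def preference_weights(rank):
    """Build symmetric edge weights w(i, j) = rank[i][j] + rank[j][i].

    rank[i][j] is the (hypothetical) rank that employee i assigns to working
    with employee j; a lower rank means a stronger preference.
    """
    employees = sorted(rank)
    weights = {}
    for pos, i in enumerate(employees):
        for j in employees[pos + 1:]:
            weights[frozenset((i, j))] = rank[i][j] + rank[j][i]
    return weights
```

The resulting dictionary has one entry per unordered pair of employees, which is exactly the edge set of the complete graph described above.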

Contribution and Organization
This paper proposes a column generation framework to solve Problem GPP with three enhancing features: (a) a complementary column generation scheme that uses a pricing problem to generate batches of columns; (b) a dual-based lower bound that can be employed to curtail the notorious tailing-off effects typically associated with column generation; and (c) the generation of a collection of vertex partitions that serves to determine a starting basis for the proposed column generation framework and also assists in computing good quality feasible solutions. The remainder of this paper is organized as follows. Section 2 presents literature related to the studied problem. In Section 3, we develop an integer mathematical programming model, denoted by GPM, for Problem GPP, which attempts to directly select a minimal cost n-partition. We then design an enhanced column generation approach (ECGA) in Section 4 to solve the linear relaxation of Model GPM, based on which we propose a heuristic procedure in Section 5 to solve Model GPM. Computational results are presented in Section 6, and we conclude the paper in Section 7 with a summary and some remarks, as well as future research extensions.

Related Literature
Several graph partitioning problems have been studied in the literature, which are motivated by applications in microaggregation (Domingo-Ferrer and Mateo-Sanz, 2002), political districting, video clustering (Schaeffer, 2007), telecommunication and VLSI design (Karypis et al., 1999), biological or social networks (Fan et al., 2009), and data mining (Zha et al., 2001). Typically, such problems arise in the context of clustering, which is a form of unsupervised classification, and the clusters must sometimes satisfy certain additional threshold criteria (Fan and Pardalos, 2012). The general graph partitioning problem aims to partition the vertex set of a graph into several disjoint subsets with the objective of minimizing the sum of edge weights between the disjoint subsets (Fan and Pardalos, 2010). This is an NP-complete combinatorial optimization problem (Garey et al., 1976) and different techniques were employed to solve it (Hager and Krylyuk, 1999). The case in which the cardinalities of all partitions are equal or differ by at most 1 was solved either by linear programming (Lisser and Rendl, 2003) or semidefinite programming (Karisch and Rendl, 1998; Lisser and Rendl, 2003). Quadratic programming (Hager and Krylyuk, 1999, 2002) and semidefinite programming (Wolkowicz and Zhao, 1996) require that the cardinalities of all partitions are known a priori. Fan and Pardalos (2010) extended this work by formulating a zero-one quadratic programming problem without the input of cardinalities of the required partitions. The objective of the problem studied in the current paper is to minimize the sum of edge weights within the resulting partitions, whereas that in the general graph partitioning problem is to minimize the sum of edge weights between the disjoint partitions.
In the following, we discuss a number of applications that are addressed using different types of graph partitioning paradigms. Micro-aggregation is a technique used by statistical agencies, where some statistical information needs to be disclosed, while the related specific individual information must remain classified. Published data therefore needs to be presented in a manner such that: (a) the classified data cannot be inferred from the published data, and (b) the deleted unclassified data is minimized. Domingo-Ferrer and Mateo-Sanz (2002) used a graph partitioning approach to solve this problem. Political redistricting is another application of graph partitioning, where boundaries of districts need to be drawn within the states to attain certain characteristics and to avoid partisan political goals. A graph partitioning political redistricting model has been designed with the motivation that (a) differences in populations for any two different districts should be minimized in order to adhere to the one-person-one-vote principle; (b) districts should be contiguous; and (c) districts should be geographically compact. Graph partitioning is also used in video scene clustering (Tan and Lu, 2003) to index, browse, and retrieve video data. In this context, a graph G(V , E) is constructed in which each vertex v ∈ V represents a scene and an edge e ∈ E between two vertices indicates the similarity obtained from some defined relations of the colours of the two scenes. The objective is to partition G with the goal of maximizing similarity within the individual partitions, where the number of partitions is not restricted.
In telecommunication technology, graph partitioning is employed to subdivide a transmission network into clusters in order to maximize the routed traffic within the clusters (Laguna, 1994). Park et al. (2000) addressed the problem of clustering a telecommunication network into local networks and hub locations. Xiao et al. (2007) developed a graph partitioning model to cluster mobile units within mobile servers. In this case, a graph G(V , E) is constructed where v ∈ V represents a mobile unit and each edge e ∈ E represents a communication link between two units, and where the weight assigned to the edge depends on some technical parameters including the bandwidth and the distance between the two vertices. Laguna (1994) used a graph partitioning model to enhance several design features and to overcome limitations of optical fiber networks within the telecommunication industry.
Graph partitioning has also been used to tackle scheduling problems. Carlson and Nemhauser (1966) developed a clustering model for a scheduling problem that involves several activities and facilities, where the problem is to cluster activities and then to assign them to the facilities so as to minimize interaction costs, given the cost of assigning pairs of activities to a facility. Salido et al. (2007) employed graph partitioning in railway scheduling to generate optimal schedules for trains, taking into consideration connection points, railway types, and train capacities, among other restrictions.
There exist several other such examples of graph partitioning problems that have been studied in the literature, e.g. see Ji (2004). These include the clique partitioning problem (Grotschel and Wakabayashi, 1989; 1990), the graph equipartitioning problem Rao, 1990a, 1990b), the capacitated graph partitioning problem (Mehrotra and Trick, 1998), the maximum balanced connected q-partition (Chlebikova, 1996; Salgado and Wakabayashi, 2004), and the minimum edge-cut graph partitioning problem (Donath and Hoffman, 1973; Goldschmidt and Hochbaum, 1988), among others. All of the above graph partitioning problems are NP-hard (Ji, 2004). In particular, similar to the problem considered in the present paper, the k-way graph equipartitioning problem is to partition the vertex set V into k subsets of equal size, with the objective of minimizing the total weight of edges that have both end-points in the same partitioned subset. Mitchell (2001) formulated a mathematical model for the k-way graph equipartitioning problem, investigated its polyhedral structure and presented a branch-and-cut algorithm to solve the resulting model. The algorithm in Mitchell (2001) was used to realign the National Football League (NFL). Results for partitioning 32 teams into eight groups with the objective of minimizing the overall travel time among teams within each group were reported in Mitchell (2003). The sports realignment problem studied in Mitchell (2001, 2003) was revisited later along with other similar contextual problems by Xiaoyun and Mitchell (2005), who modelled the realignment of NBA, NHL, and NFL as k-way equipartition problems. A branch-and-price scheme with cutting plane features was designed and implemented to solve the resulting k-way equipartitioning problems, where the pricing problem was modelled as an integer program.
Computational results indicated that the branch-and-price-and-cut scheme of Xiaoyun and Mitchell (2005) performed well on small-sized instances (with about 40 vertices). However, for larger test instances, the pricing problem turned out to be cumbersome to solve. Nonetheless, for such problems, the root algorithm was found to yield relatively good quality feasible solutions. The algorithm in Xiaoyun and Mitchell (2005) was also used to solve certain micro-aggregation problems. The authors concluded that the performance of their proposed branch-and-price-and-cut approach was comparable to that of the price-and-cut method of Mitchell (2001, 2003).

Formulation of Model GPM
In this section, we formulate a model for Problem GPP, denoted GPM, which directly attempts to select a minimum-cost collection of n valid partitions from the set P in order to constitute an n-partition.

Model formulation
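Define the following set of binary decision variables:
$$x_p = \begin{cases} 1, & \text{if partition } p \in P \text{ is selected,} \\ 0, & \text{otherwise.} \end{cases}$$
For a given partition $p \in P$, we define the following set of parameters that indicate whether a vertex $v \in V$ belongs to the associated vertex subset $V_p$ or not:
$$\lambda_{v,p} = \begin{cases} 1, & \text{if } v \in V_p, \\ 0, & \text{otherwise.} \end{cases}$$
Note that the values of the parameters $\lambda_{v,p}$ are known a priori based on information derived from the corresponding subset $V_p$. Then, the following model determines a minimum-cost n-partition:
$$\text{GPM: Minimize } \sum_{p \in P} w_p x_p$$
$$\text{subject to } \sum_{p \in P} \lambda_{v,p}\, x_p = 1, \quad \forall v \in V, \qquad (3.1)$$
$$\sum_{p \in P} x_p = n, \qquad (3.2)$$
$$x_p \in \{0, 1\}, \quad \forall p \in P.$$
The objective function of GPM minimizes the overall weight of edges associated with the selected n-partition. Constraint (3.1) assures that each $v \in V$ belongs to exactly one valid partition. The required number of valid partitions ($n$) is enforced by Constraint (3.2). The continuous relaxation of Model GPM, denoted by $\overline{\mathrm{GPM}}$, is given as follows: Minimize $\sum_{p \in P} w_p x_p$ subject to (3.1) and (3.2), where $x_p \ge 0$, $\forall p \in P$.
Note that $x_p \le 1$ is implied by (3.1). The following structural result indicates that Constraint (3.2) can be deleted from the above model without affecting the solution, even in the continuous sense.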
Notwithstanding Proposition 1, we retain Constraint (3.2) in the model because of the lower bounding facility it provides for GPM, which enables a useful practical stopping criterion when solving the latter problem (see Proposition 2 below).
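One way to see this redundancy is sketched below. It is offered as an illustration rather than as the original proof, and assumes only that Constraint (3.1) reads $\sum_{p \in P} \lambda_{v,p} x_p = 1$ for every $v \in V$ and that each subset $V_p$, $p \in P$, contains exactly $\alpha$ vertices, so that $\sum_{v \in V} \lambda_{v,p} = \alpha$. Summing (3.1) over all vertices and exchanging the order of summation gives:

```latex
\begin{align*}
|V| \;=\; \sum_{v \in V} \sum_{p \in P} \lambda_{v,p}\, x_p
    \;=\; \sum_{p \in P} x_p \sum_{v \in V} \lambda_{v,p}
    \;=\; \alpha \sum_{p \in P} x_p
\quad\Longrightarrow\quad
\sum_{p \in P} x_p \;=\; \frac{|V|}{\alpha} \;=\; n .
\end{align*}
```

Because only equalities are used, the argument does not rely on the integrality of $x$, which is consistent with the remark that (3.2) is redundant even in the continuous sense.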
Note that Model GPM attempts to directly select a minimal cost n-partition, i.e. a minimal cost collection of n valid partitions from the set P . An alternative modelling approach to solve Problem GPP is to designate decision variables that assign vertices to different subsets and to designate constraints that enforce the cardinality of each subset. This modelling approach is the subject of a follow-on paper, where we will attempt to employ a Lagrangean-based decomposition scheme in concert with symmetry-defeating strategies to solve Problem GPP. As will be seen later, the formulation of Model GPM enabled us to devise a column generation algorithm to heuristically solve Problem GPP, which is the focus of the current paper.

An Enhanced Column Generation Approach to Solve Model GPM
In this section, we exploit the special column structure of GPM in order to solve its continuous LP relaxation $\overline{\mathrm{GPM}}$ via a column generation procedure (e.g. see Barnhart et al., 1998), along with three enhancing features as discussed below. Suppose that at some iteration of the revised simplex method for solving $\overline{\mathrm{GPM}}$, we have a basic feasible solution. Let $\{\xi \equiv (\xi_v, v \in V), \xi_0\}$ denote the corresponding complementary dual solution, where $\xi$ and $\xi_0$ are the dual variables associated with Constraints (3.1) and (3.2), respectively. We can then find a candidate entering nonbasic variable $x_p$ that has the smallest (most negative) reduced cost by solving the following auxiliary subproblem, where $\pi_v$ equals one if vertex $v \in V$ is selected for inclusion within $V_p$ and is zero otherwise:
$$\text{SP: Minimize } \sum_{v_i, v_j \in V,\ i<j} w(v_i, v_j)\, \pi_{v_i} \pi_{v_j} \;-\; \sum_{v \in V} \xi_v \pi_v \;-\; \xi_0$$
$$\text{subject to } \sum_{v \in V} \pi_v = \alpha, \qquad \pi_v \in \{0, 1\}, \quad \forall v \in V.$$
Note that the resulting vector $\pi \equiv (\pi_v, v \in V)$ corresponds to a valid partition, say $p \in P$, where $\lambda_{v,p} = \pi_v$, $\forall v \in V$, and where the first term of the objective function represents $w_p$ for the generated entering column. To solve Problem SP, each product relationship $\pi_{v_i} \pi_{v_j}$ that appears in the objective function can be linearized by substituting a continuous variable $\gamma_{v_i, v_j}$ instead, while incorporating the following constraints, noting that $\pi_{v_i}$ and $\pi_{v_j}$ are required to be binary-valued and that the $w$-parameters are positive (see Sherali and Warren, 1998):
$$\gamma_{v_i, v_j} \ge \pi_{v_i} + \pi_{v_j} - 1, \qquad \gamma_{v_i, v_j} \ge 0, \qquad \forall v_i, v_j \in V,\ i < j. \qquad (4.1)$$
Hence, letting $\tau^*(M)$ denote the optimal objective function value of any model $M$, if $\tau^*(\mathrm{SP}) \ge 0$, then no nonbasic variable is a candidate to enter the basis, and an optimal solution to Problem $\overline{\mathrm{GPM}}$ is at hand. Otherwise, if $\tau^*(\mathrm{SP}) < 0$, we will have obtained a candidate entering variable $x_p$ for $\overline{\mathrm{GPM}}$ from the optimal solution obtained for SP as noted above, and we then introduce this column into the basis and reiterate.
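The role of the pricing subproblem can be illustrated with a small Python sketch that evaluates the reduced cost $w_p - \sum_{v \in V_p} \xi_v - \xi_0$ of every $\alpha$-subset by enumeration. This brute-force search is a stand-in for the 0-1 program SP, assumed here only for illustration on tiny instances; the function name, the `forbidden` argument, and the dictionary-based data layout are hypothetical.

```python
from itertools import combinations


def solve_pricing_by_enumeration(vertices, weights, xi, xi0, alpha,
                                 forbidden=frozenset()):
    """Return (reduced_cost, subset) minimizing w_p - sum_v xi[v] - xi0.

    The search runs over all alpha-subsets of the vertices not in `forbidden`.
    Returns None when fewer than alpha vertices remain (infeasible subproblem).
    """
    allowed = [v for v in sorted(vertices) if v not in forbidden]
    best = None
    for subset in combinations(allowed, alpha):
        # First term of the SP objective: weight of the candidate column.
        w_p = sum(weights[frozenset(edge)] for edge in combinations(subset, 2))
        reduced_cost = w_p - sum(xi[v] for v in subset) - xi0
        if best is None or reduced_cost < best[0]:
            best = (reduced_cost, subset)
    return best
```

A negative first component signals a column that can enter the basis, while a nonnegative value corresponds to the optimality condition $\tau^*(\mathrm{SP}) \ge 0$.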

Enhancing Features
Next, we discuss three enhancing features that can improve the solvability of Problem GPM by mitigating the tailing-off effect that is often induced by the classical column generation approach.

A) Duality based lower bounding termination criterion
The following proposition, whose proof readily follows from Proposition 1 in Ghoniem and Sherali (2009) (we include a specialized proof below for the sake of completeness), portends an optimality gap via the solution of Problem SP that will enable us to conveniently terminate the solution of Problem GPM within some percentage of optimality.
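Proposition 2. At any iteration of the column generation process to solve Problem $\overline{\mathrm{GPM}}$, the solution to Problem SP provides a dual feasible solution to $\overline{\mathrm{GPM}}$ with a duality gap of $-n\tau^*(\mathrm{SP}) \ge 0$.
Proof. Let $(\xi, \xi_0)$ be the complementary dual solution to the restricted version of $\overline{\mathrm{GPM}}$ at any iteration, where this restricted problem provides an upper bound of
$$\sum_{v \in V} \xi_v + n\xi_0 \qquad (4.2)$$
on the value $\tau^*(\overline{\mathrm{GPM}})$. Moreover, from the corresponding problem SP, we get
$$w_p - \sum_{v \in V} \lambda_{v,p}\, \xi_v - \xi_0 \ge \tau^*(\mathrm{SP}), \quad \forall p \in P,$$
or that $(\xi, \xi_0 + \tau^*(\mathrm{SP}))$ is dual feasible to $\overline{\mathrm{GPM}}$, thus establishing a lower bound on the value $\tau^*(\overline{\mathrm{GPM}})$, with a dual objective function value of
$$\sum_{v \in V} \xi_v + n\,[\xi_0 + \tau^*(\mathrm{SP})]. \qquad (4.3)$$
From (4.2) and (4.3), we therefore infer that this dual feasible solution yields a duality gap of $-n\tau^*(\mathrm{SP}) \ge 0$.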

B) Generation of complementary columns
Instead of generating a single negative reduced-cost column as done above in the classical column generation (CG), the complementary column generation (CCG) of Ghoniem and Sherali (2009) advocates the generation of multiple columns at each iteration to form a feasible n-partition (whenever possible, that is, unless an infeasible subproblem is encountered), as described next. Let $\pi = \{\pi_v, v \in V\}$ be a solution to Problem SP, based on which, let $\Omega = \{v : \pi_v = 1\}$ denote the set of prohibited vertex indices. Let $\mathcal{V}$ be initialized as a set that contains the partition that includes the vertices in $\Omega$, and let $\mathcal{X}$ be initialized as a set that contains the variable in GPM corresponding to the partition in $\mathcal{V}$. We then re-solve Problem SP with the additional requirements that $\pi_v = 0$, $\forall v \in \Omega$. Let $\pi^{new} = \{\pi_v^{new}, v \in V\}$ denote the resulting solution.
Next, the set of prohibited indices is augmented by setting $\Omega \leftarrow \Omega \cup \{v : \pi_v^{new} = 1\}$, and the sets $\mathcal{V}$ and $\mathcal{X}$ are updated accordingly. The foregoing step is repeated until $|\Omega| = |V|$ or an infeasible subproblem is encountered. The variables in $\mathcal{X}$ along with their respective partitions from $\mathcal{V}$ will serve to augment a restricted version of Model GPM that will be used in the next subsection within a column generation framework to solve GPM. Note that when $|\Omega| = |V|$, the set $\mathcal{V}$ consists of a batch of columns that collectively constitute a feasible solution to Model GPM. Moreover, even if an infeasible subproblem is encountered in the foregoing process, the set of partitions generated thus far can be used to fruitfully augment the current restricted version of Model GPM.
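The batch-generation loop described above can be sketched as follows. This is a minimal illustration under stated assumptions: Problem SP is replaced by a brute-force search (`price`, a hypothetical helper) that is viable only for tiny graphs, and the names `prohibited` and `batch` are illustrative labels for the set of prohibited vertices and the generated batch of columns.

```python
from itertools import combinations


def price(allowed, weights, xi, xi0, alpha):
    """Brute-force stand-in for Problem SP on the allowed vertices; None if infeasible."""
    candidates = [
        (sum(weights[frozenset(e)] for e in combinations(s, 2))
         - sum(xi[v] for v in s) - xi0, s)
        for s in combinations(sorted(allowed), alpha)
    ]
    return min(candidates) if candidates else None


def complementary_columns(vertices, weights, xi, xi0, alpha):
    """Generate mutually disjoint columns by re-solving SP on the unused vertices."""
    prohibited = set()   # vertices already covered by a generated column
    batch = []           # generated vertex subsets, in order of generation
    while len(prohibited) < len(vertices):
        result = price(set(vertices) - prohibited, weights, xi, xi0, alpha)
        if result is None:        # infeasible subproblem: keep the columns found so far
            break
        _, subset = result
        batch.append(subset)
        prohibited.update(subset)
    return batch
```

When the loop ends with every vertex prohibited, the returned subsets are pairwise disjoint and cover $V$, so together they form one feasible n-partition.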

C) Determining a starting basis for Model GPM
Note that Model GPM is highly degenerate because it has |V | + 1 rows, but any basic feasible solution corresponding to a feasible binary solution involves only |V |/α = n nonzero binary x-variables. This will likely exacerbate the initial oscillations of dual solutions within the column generation procedure, which typically slows the convergence of the algorithm. Various dual stabilization approaches have been discussed in the literature to mitigate this phenomenon; see, for example, Bazaraa et al. (2010). Instead of using such dual stabilization techniques, we simply try to diminish the occurrence of oscillations by generating additional columns (as discussed next) in order to restrict the dual solution space, which thereby essentially contributes toward the dual stabilization process.
With this motivation, we determine $3n + |V|$ partitions of $V$ as follows, which can then be used along with suitable artificial variables (at zero values) to determine a starting basis for GPM:
a) [...] $\{V_1, \ldots, V_n\}$ constitutes a valid n-partition.
b) For $v = 1, \ldots, |V|$, we construct $|V|$ partitions, denoted by $V_{n+v}$, as follows. Consider the subgraph of $G$ that contains only the $(\alpha - 1)$ least-weight edges incident to $v$. Let $V_{n+v}$ be the set of vertices that contains $v$ as well as the other $(\alpha - 1)$ vertices adjacent to $v$ via the selected $(\alpha - 1)$ edges. Note that the vertex partitions $V_{n+1}, \ldots, V_{n+|V|}$ are not necessarily mutually exclusive.
c) [...] v)) and proceed in this fashion until $V_{n+|V|+1}$ contains $\alpha$ vertices. Pick $v_1 \in V \setminus V_{n+|V|+1}$ and let $V_{n+|V|+2}$ be the set of vertices from $V \setminus V_{n+|V|+1}$ that contains $v_1$ along with $(\alpha - 1)$ vertices chosen as discussed in the process of constructing $V_{n+|V|+1}$. Continue in this manner until $n$ vertex partitions are constructed, which collectively constitute an n-partition.
d) Determine the least cost vertex partition by solving Problem SP with a modified objective function given by $\sum_{v_i, v_j \in V,\ i<j} w(v_i, v_j)\, \pi_{v_i} \pi_{v_j}$. The resulting solution determines a valid partition; denote this $V_{2n+|V|+1}$. Next, we re-solve Problem SP with the aforementioned modified objective function while updating the set of vertices to exclude all the vertices in $V_{2n+|V|+1}$. We proceed in this manner until we construct $n$ vertex partitions that collectively constitute a valid n-partition.
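As an illustration of step (b), the following Python sketch builds, for each vertex, the subset consisting of that vertex and the $(\alpha - 1)$ vertices joined to it by its lightest incident edges. The function name and the rule of breaking weight ties by vertex label are assumptions made for the example; the text above does not specify a tie-breaking rule.

```python
def neighbour_subsets(vertices, weights, alpha):
    """For each vertex v, return v plus the alpha-1 vertices on its lightest incident edges."""
    subsets = []
    for v in vertices:
        # Sort the other vertices by the weight of the edge joining them to v;
        # ties are broken by vertex label (an assumption of this sketch).
        others = sorted((u for u in vertices if u != v),
                        key=lambda u: (weights[frozenset((u, v))], u))
        subsets.append(tuple(sorted([v] + others[:alpha - 1])))
    return subsets
```

As noted in step (b), the subsets produced in this way may overlap or even coincide, so they enrich the initial column pool without necessarily forming an n-partition on their own.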
Hereafter, we will refer to the foregoing three enhancing complementary column generation features discussed in this section as CCG features.

Enhanced Column Generation Algorithm to Solve GPM
An enhanced column generation algorithm that incorporates the above three features, denoted by ECGA, is presented next to solve GPM. The best known lower and upper bounds derived in this process for solving GPM as described below are denoted by LB * and UB * , respectively.

Initialization Step
Let $X_0$ be the set of x-variables associated with the vertex partitions $V_i$, $i = 1, \ldots, 3n + |V|$, as determined above, along with suitable artificial variables incorporated within the constraints of Model GPM (to determine a starting basis). Consider a restricted version of $\overline{\mathrm{GPM}}$, denoted by $\overline{\mathrm{GPM}}_0$, which contains the variables in $X_0$. Solve $\overline{\mathrm{GPM}}_0$ directly using CPLEX to determine an initial basic feasible solution. If the set of basic variables contains any artificial variables at optimality, then iteratively solve Problem SP to generate columns until the residual artificial variables are eliminated. Let $X_1$ be the set of (non-artificial) variables in the current column generation master program. Select a suitable optimality tolerance $\varepsilon_1$, set $LB^* = -\infty$, $UB^* = \infty$, and $i = 1$, and proceed to the Main Step.

Main Step
Construct a restricted version of $\overline{\mathrm{GPM}}$, denoted by $\overline{\mathrm{GPM}}_i$, which only involves the variables in $X_i$, and solve $\overline{\mathrm{GPM}}_i$. Let $\{\xi, \xi_0\}$ denote the corresponding complementary dual solution for $\overline{\mathrm{GPM}}_i$. Set $UB^* = \tau^*(\overline{\mathrm{GPM}}_i) = \sum_{v \in V} \xi_v + n\xi_0$. Solve the subproblem SP, and let $\pi^*$ be the optimal solution obtained. Set $LB^* = \max\{LB^*, \sum_{v \in V} \xi_v + n[\xi_0 + \tau^*(\mathrm{SP})]\}$.
a) If $\tau^*(\mathrm{SP}) \ge 0$, stop; the optimal solution obtained for $\overline{\mathrm{GPM}}_i$ is also optimal for $\overline{\mathrm{GPM}}$.
b) Trigger the CCG feature to determine the batch of complementary columns $\mathcal{X}$. Set $X_{i+1} = X_i \cup \mathcal{X}$ and let $i \leftarrow i + 1$. If $100\,(UB^* - LB^*)/UB^* \le \varepsilon_1$, stop; we have an optimal solution for $\overline{\mathrm{GPM}}$ within an optimality tolerance of $\varepsilon_1$%. Otherwise, repeat the Main Step.
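The bound bookkeeping of the Main Step amounts to a few arithmetic operations, sketched below in Python. The function names are hypothetical, and the dual values are assumed to be supplied by whichever LP solver is used for the restricted master program; no solver API is invoked here.

```python
def update_bounds(xi, xi0, tau_sp, n, best_lb=float("-inf")):
    """Return (LB*, UB*) after one Main Step iteration.

    xi: dict of duals for Constraints (3.1); xi0: dual for Constraint (3.2);
    tau_sp: optimal value of Problem SP (at most 0 while columns are still added).
    """
    ub = sum(xi.values()) + n * xi0                           # expression (4.2)
    lb = max(best_lb, sum(xi.values()) + n * (xi0 + tau_sp))  # expression (4.3)
    return lb, ub


def gap_percent(lb, ub):
    """Percentage optimality gap 100 (UB* - LB*) / UB* used in the stopping test."""
    return 100.0 * (ub - lb) / ub
```

For example, with four duals equal to 10, $\xi_0 = 5$, $n = 2$, and $\tau^*(\mathrm{SP}) = -1$, the bounds are 48 and 50, so the gap of 2 equals $-n\tau^*(\mathrm{SP})$, in line with Proposition 2.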

Analysis of Algorithm ECGA
a) Algorithm ECGA establishes an upper bound $UB^*$ on $\overline{\mathrm{GPM}}$ that decreases monotonically at each iteration of ECGA. Also, at each iteration of ECGA, the lower bound $LB^*$ for $\overline{\mathrm{GPM}}$ is updated based on the solution to Problem SP. At the end of Algorithm ECGA, the best provable lower bound on Model GPM is given by $\tau^*(\overline{\mathrm{GPM}})$ if $\tau^*(\mathrm{SP}) \ge 0$, and by $LB^*$ otherwise.
b) When the CCG feature is triggered at a given iteration of Algorithm ECGA, the generated set of variables $\mathcal{X}$ consists of a batch of columns that collectively lend themselves toward composing a feasible solution to Model GPM. The process of generating complementary columns judiciously includes multiple sets of feasible solutions whose composition is likely to enhance the possibility of encompassing optimal or near-optimal solutions when solving the last restricted problem of GPM as a binary restricted problem, as used in the procedure described in Section 5 below. Moreover, the additional columns generated provide a further restriction on the dual search space, which induces dual stabilization.
c) The duality-based lower bound established in Proposition 2 offers a helpful $\varepsilon_1$-based termination criterion that serves to circumvent the notorious tailing-off trend often associated with column generation procedures. Hence, both of the above features, along with the generation of $3n + |V|$ columns to initialize Algorithm ECGA, are instrumental in designing an effective heuristic approach for solving GPM with a provable optimality tolerance at termination, as described in the next section.
d) Note that at any iteration of ECGA, columns that are already (explicitly) present in the restricted master program ($\overline{\mathrm{GPM}}_i$) price out with nonnegative reduced costs, and therefore these columns are automatically excluded from the solution to Problem SP (except at termination when $\tau^*(\mathrm{SP}) \ge 0$). However, the inclusion of constraints in Problem SP that explicitly exclude all such columns provides valid cuts that might serve to tighten the continuous relaxation of this problem and, hence, enhance its solvability. For this purpose, letting $\hat{V} \subseteq V$ be the set of vertices that characterize the partition corresponding to any column that is currently in $\overline{\mathrm{GPM}}_i$, we add the following constraint to Problem SP for each such column:
$$\sum_{v \in \hat{V}} \pi_v \le \alpha - 1. \qquad (4.4)$$
e) Although the formulation of Model GPM incorporates all potential partitions as represented by the $x_p$ variables, the initial step of the proposed column generation heuristic (ECGH) contains only a small subset of the $x_p$ variables; more valid partitions ($x_p$ variables) are then added iteratively until a heuristic solution is attained. In fact, this is the main advantage of adopting a column generation framework, where initially only a subset of the columns is present in the model, and more columns are added subsequently until a heuristic solution is obtained.

A Heuristic Procedure for Solving GPM
In this section, we present a heuristic approach to solve Model GPM, denoted by ECGH, which is a sequential variable-fixing procedure that constructs a feasible n-partition in a sequential fashion in order to solve Problem GPP. Essentially, this procedure generates an n-partition by augmenting fixed partitions from solutions to GPM obtained via the ECGA method outlined in the foregoing section.
To describe this procedure, consider an optimal solution to $\overline{\mathrm{GPM}}$ obtained via ECGA. Let $S_b^1$ be the index set of the basic variables that equal one when ECGA terminates, and let $S_b^f$ be the set of fractional basic variables at optimality. (Note that if $S_b^f = \emptyset$, we have an n-partition at hand, and we stop with this solution as optimal for GPM.) Initialize the set of fixed partitions as $\Phi = S_b^1$. Hence, $V_{p_i} \cap V_{p_j} = \emptyset$, $\forall p_i, p_j \in \Phi$ with $i \neq j$, because otherwise, if there exists some $v \in V_{p_i} \cap V_{p_j}$, then equation (3.1) corresponding to vertex $v$ would be violated.
Note that $|\Phi| \le n$, or else we would have an $n'$-partition, where $n' > n$, which involves $\alpha n'$ vertices from $V$, contradicting the fact that $|V| = \alpha n$.
The following Variable-Fixing Step will be used with our proposed enhanced column generation heuristic described subsequently.

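Variable-Fixing Step
Let $\bar{x}$ be the optimal solution obtained for $\overline{\mathrm{GPM}}$ and let $x_{\max} = \max_{p \in S_b^f} \{\bar{x}_p\}$, and determine $P_{\max} = \{p \in S_b^f : \bar{x}_p = x_{\max}\}$, $w_{\min} = \min\{w_p : p \in P_{\max}\}$, and $P_{\min} = \{p \in P_{\max} : w_p = w_{\min}\}$. Pick $\hat{p} \in P_{\min}$ and update the set of fixed partitions $\Phi \leftarrow \Phi \cup \{\hat{p}\}$. Let $P' = \{p \in P : V_p \cap V_{p_1} = \emptyset, \forall p_1 \in \Phi\}$, and correspondingly, define $V' = \{v \in V : v \notin \bigcup_{p \in \Phi} V_p\}$. Based on the above variable-fixing step, we let GPM$'$ be a modified version of GPM obtained by: (a) restricting the set of partitions $P$ to $P'$, and (b) replacing $V$ by $V'$ in (3.1). This problem is given as follows:
$$\text{GPM}': \text{Minimize } \sum_{p \in P'} w_p x_p$$
$$\text{subject to } \sum_{p \in P'} \lambda_{v,p}\, x_p = 1, \quad \forall v \in V',$$
$$\sum_{p \in P'} x_p = n - |\Phi|,$$
$$x_p \in \{0, 1\}, \quad \forall p \in P'.$$
We can now solve the continuous relaxation of GPM$'$ using ECGA as before, based on the remnant set of vertices, and repeat the process until an n-partition is obtained. The proposed heuristic (ECGH) for generating an n-partition is stated formally below, noting that whenever $|\Phi| = n$, we have at hand an n-partition that is described by the set of valid partitions $\{p \in \Phi\}$.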

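LP-Step
Solve the continuous relaxation of the current model (GPM, as modified by any previously fixed partitions) using ECGA, and let $\bar{x}$ denote the resulting solution. Determine the index sets $S_b^1$ and $S_b^f$.

Variable-Fixing Step
(This step is described above.)

Final Step
Let $\Phi \leftarrow \Phi \cup \{\hat{p}\}$, where $\Phi$ is the set of fixed partitions and $\hat{p}$ is the partition selected in the Variable-Fixing Step. If $|\Phi| = n$, stop; otherwise, update the set of remaining compatible partitions and the set of remaining vertices, and repeat the LP-Step.
Remark 3. a) At each iteration of ECGH, the set $\Phi$ is augmented by at least one valid partition in a feasible fashion, and the number of elements in $\Phi$ cannot exceed $n$. Consequently, the algorithm terminates finitely whenever $|\Phi| = n$, yielding a desired n-partition. b) In the Variable-Fixing Step, note that $V_{\hat{p}} \cap V_p = \emptyset$, $\forall p \in \Phi$, because otherwise, (3.1) would be violated for some $v \in V$. Hence, we maintain a partitioning of the vertices while augmenting the set $\Phi$ according to $\Phi \leftarrow \Phi \cup \{\hat{p}\}$.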
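The selection rule of the Variable-Fixing Step and the subsequent reduction of the problem can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names, the numerical tolerance `tol` used to detect fractional values, and the tie-break by column label are assumptions introduced for the example.

```python
def pick_partition_to_fix(x, w, tol=1e-9):
    """Among fractional LP values, pick the largest; break ties by smallest weight w_p.

    x: dict column -> LP value; w: dict column -> w_p.
    Returns None when no variable is fractional.
    """
    fractional = {p: val for p, val in x.items() if tol < val < 1 - tol}
    if not fractional:
        return None
    x_max = max(fractional.values())
    top = [p for p, val in fractional.items() if abs(val - x_max) <= tol]
    return min(top, key=lambda p: (w[p], p))


def remaining_after_fixing(subsets, fixed, vertices):
    """Return the columns disjoint from every fixed column, and the uncovered vertices."""
    covered = set().union(*(subsets[p] for p in fixed))
    compatible = [p for p in subsets
                  if p not in fixed and not (set(subsets[p]) & covered)]
    return compatible, [v for v in vertices if v not in covered]
```

Calling `pick_partition_to_fix` after each LP solve and then `remaining_after_fixing` on the enlarged set of fixed columns reproduces, on explicit column lists, the reduction that the heuristic applies before re-solving.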

Computational Results
In this section, we present computational results for the proposed complementary column generation approach for solving Model GPM. We use a set of 24 test problems described in Table 1 for GPM, where TP i , i = 1, . . . , 24, represents the i-th test problem. For all the test instances, the weights associated with the edges are randomly generated within the interval [1, 1000]. All computational experiments were performed on a computer with a Core™ i7 processor (4.00 GHz CPU) and 4 GB of RAM, using the CPLEX commercial package (version 12) as the optimization solver. Below, we summarize the notation that will be used in this section.
• T ECGA : The total solution time for solving GPM using Algorithm ECGA.
• T ECGH : The total solution time for solving GPM using Heuristic ECGH.
We begin by presenting computational results pertaining to solving Model GPM using Heuristic ECGH based on a set of 24 test problems having up to 100 vertices with different partitioning requirements. Tables 2 and 3, respectively, present our computational experience in solving Model GPM with and without the CCG Features. Note that because the weights associated with Problem GPP are randomly generated, repeated implementations of Heuristic ECGH might produce different objective function values, optimality gaps and run-times for a given test problem. Also, observe that in Proposition 2, we have assumed that Problem SP is solved using $\varepsilon_2 = 0$, in which case we have the provable lower bound given either by $\tau^*(\overline{\mathrm{GPM}})$ if $\tau^*(\mathrm{SP}) \ge 0$ or otherwise by $LB^*$. However, when using a tolerance $\varepsilon_2 > 0$ while solving Problem SP, we can modify Proposition 2 to assert that the established duality gap is given by $-n\tau^*(\mathrm{SP}) + n\varepsilon_2 \ge 0$, with $LB^* \leftarrow LB^* - n\varepsilon_2$. The use of this lower bounding scheme is instrumental in curtailing the tailing-off effect associated with column generation. Hence, for each test problem, we first set $\varepsilon_1 = \varepsilon_2 = 0$ and try to solve Model GPM within some specified time. In case a solution is not obtainable, we then set $\varepsilon_2 = 0$ and set $\varepsilon_1$ to some sufficiently small value and gradually increment it until we reach a solution within the specified time. For test instances that remain unsolved, we set $\varepsilon_2$ to a sufficiently small positive value and increment it until we obtain a solution within the specified time. From our preliminary computational experiments, we noticed that solving Problem SP even with the default cutting plane feature of CPLEX was cumbersome in some test instances, especially those having a relatively large number of vertices, while the time to update the solution to $\overline{\mathrm{GPM}}$ was in most cases just a few seconds.
To better understand the efficiency of the CCG Features and their contribution toward enhancing the solution quality for Model GPM, we experimented with solving this model using Heuristic ECGH both with and without the CCG Features. As shown in Tables 2 and 3, the incorporation of the CCG Features reduced the average solution time for solving Model GPM by 11-fold and for solving Model GPM by 12-fold. This substantial reduction in the average solution times is attained by mitigating the tailing-off effect at termination. The average optimality gaps at termination when solving Model GPM with and without the enhancing features are 3.08% and 4.22%, respectively. The higher average optimality gap obtained for the latter is skewed by the relatively high optimality gap of 57.89% obtained when solving problem TP 1 (without this test case, the average optimality gap is 1.88%). Nevertheless, the significant reduction in the average solution time when using the CCG Features strengthens the robustness of the proposed column generation approach.
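The effect of the single outlier on the average gap can be checked with a short calculation. The sketch below assumes that the 4.22% average is taken over all 24 test problems; the helper name `mean_without` is hypothetical.

```python
def mean_without(mean_all, count, outlier):
    """Mean of the remaining values after removing one outlier from a sample."""
    if count < 2:
        raise ValueError("need at least two values to drop one")
    return (mean_all * count - outlier) / (count - 1)


# Assumption: the reported 4.22% average covers all 24 test problems.
gap_without_tp1 = mean_without(4.22, 24, 57.89)
print(f"{gap_without_tp1:.2f}%")
```

The result is approximately 1.89%, which agrees with the reported 1.88% up to rounding in the published average.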
In the remainder of this section, we focus on and analyse the results obtained using the CCG Features. To provide insights into the performance of our approach for solving Model GPM, we partition our test problems into three sets, denoted by $S_1$, $S_2$, and $S_3$, based on the number of vertices ranging up to 36, 84, and 100, respectively. Tables 4 and 5 provide results for these partitioned subsets of problems. As expected, with an increase in the number of vertices, both the total run-time for solving Model GPM using ECGH and the resulting optimality gap at termination increased. Test problems from the set $S_1$ (with $n \le 36$) were solvable without using any termination tolerances for solving either GPM or Problem SP, and in this case the least, largest, and average optimality gaps obtained were 0%, 5.88%, and 1.35%, respectively. For test problems from $S_2$ (having $n \le 84$), we solved Model GPM using a range of incrementally increasing gap tolerances. In this case, the least, largest, and average optimality gaps obtained were 0.7%, 6.96%, and 3.45%, respectively. For test problems in $S_3$ (having $n \le 100$), using the aforementioned modified lower bounding result, the least, largest, and average optimality gaps obtained were 5.72%, 12.22%, and 8.35%, respectively.

Summary, Conclusions and Future Research
This paper examines a graph partitioning problem that is concerned with the partitioning of a complete weighted graph G(V, E) into n complete subgraphs, each having the same number α of vertices, with the objective of minimizing the total weight of edges included in the subgraphs. This problem has many applications in various contexts such as assignment-related group partitioning problems, micro-aggregation in statistics, telecommunication, and political redistricting. To solve this problem, we formulated a mixed-integer program, denoted by GPM, which directly attempts to select a minimum-cost collection of n valid partitions from the entire set of valid partitions in order to constitute an n-partition. Exploiting the structure of Model GPM, we then designed a column generation heuristic (ECGH) that incorporates the following three enhancing features for solving this model: (a) a lower bounding facility based on solving the pricing subproblem, which helps to curtail the tailing-off effect typically associated with column generation; (b) a complementary column generation feature that attempts to generate multiple columns at each iteration to constitute a feasible n-partition; and (c) the generation of initial columns for Model GPM that serve to provide a starting basis as well as to restrict the dual solution space, thereby contributing toward dual stabilization. Detailed computational results were presented for solving Model GPM. These results demonstrated that the CCG Features proposed for enhancing the traditional column generation framework yielded solutions of comparable quality (an optimality gap of about 3% on average) with respect to the standard classical column generation approach while reducing the average run-time for solving Models GPM and GPM by 11-fold and 12-fold, respectively. Based on our computational results, we propose investigating further algorithmic strategies for dealing with relatively larger problems.
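For very small instances, the problem summarized above can be solved by exhaustive enumeration, which makes the objective concrete. The following is a minimal sketch of such a brute-force check, not the column generation heuristic ECGH; the function names and the symmetric weight-matrix representation are assumptions made for illustration.

```python
from itertools import combinations


def subset_weight(subset, w):
    """Sum of edge weights over all vertex pairs inside one subset."""
    return sum(w[i][j] for i, j in combinations(subset, 2))


def best_equipartition(w, n):
    """Enumerate all partitions of vertices 0..|V|-1 into n equal-sized subsets.

    w is a symmetric weight matrix (assumed representation). Returns the
    pair (minimum total weight, list of vertex subsets).
    """
    num_vertices = len(w)
    if n <= 0 or num_vertices % n != 0:
        raise ValueError("|V| must be a positive multiple of n")
    alpha = num_vertices // n  # vertices per subset

    def search(remaining):
        if not remaining:
            return 0, []
        # Fix the smallest remaining vertex in the next subset so that each
        # partition is enumerated exactly once.
        first, rest = remaining[0], remaining[1:]
        best_cost, best_groups = float("inf"), None
        for others in combinations(rest, alpha - 1):
            group = (first,) + others
            left = tuple(v for v in rest if v not in others)
            cost, groups = search(left)
            cost += subset_weight(group, w)
            if cost < best_cost:
                best_cost, best_groups = cost, [group] + groups
        return best_cost, best_groups

    return search(tuple(range(num_vertices)))


# Illustrative 4-vertex instance with n = 2 subsets of alpha = 2 vertices.
w = [
    [0, 1, 5, 4],
    [1, 0, 3, 6],
    [5, 3, 0, 2],
    [4, 6, 2, 0],
]
cost, groups = best_equipartition(w, 2)
```

The number of candidate partitions grows very quickly with |V|, so this enumeration is only useful for validating solutions on tiny instances.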
In particular, solving Problem SP within algorithm ECGH was especially cumbersome in test instances of Problem GPP that involve a relatively large number of vertices. In fact, we were able to obtain reasonable solutions for up to 100 vertices for Model GPM, but for problems having 150 vertices, we were unable to solve even the linear programming relaxation of Problem SP after two days of run-time. Hence, we recommend exploring some alternative ways for solving Problem SP, including a polyhedral analysis coupled with more effective heuristic solution approaches.
Another extension worth exploring for solving Model GPM is as follows. Let GPM LR be the current restricted version of Model GPM obtained from the final iteration of Algorithm ECGH. We can then solve GPM LR to optimality as a 0-1 program directly using a commercial package such as CPLEX, utilizing some suitable specialized decomposition scheme as necessary, in order to obtain a good quality feasible solution to Model GPM. This might be particularly attractive because of the complementary column generation strategy implemented within the solution process (Ghoniem and Sherali, 2009). Moreover, this solution approach can further provide a facility to consider equity issues within the vertex partitioning scheme through the addition of suitable side-constraints, which are otherwise difficult to accommodate within the column generation modelling and solution process. In the future, we also aim to explore alternative modelling approaches for Problem GPP that attempt to directly generate the required partitions and use decomposition schemes such as Lagrangian relaxation while incorporating the necessary equity considerations.
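The idea of solving the final restricted model as a 0-1 program can be illustrated on a toy column pool. The sketch below simply enumerates all selections of n columns and keeps the cheapest one that covers every vertex exactly once; it stands in for a call to a solver such as CPLEX and is not the procedure used in the paper. The function name and data layout are hypothetical.

```python
from itertools import combinations


def solve_restricted_pool(columns, costs, num_vertices, n):
    """Pick n columns from a pool that partition vertices 0..num_vertices-1.

    columns : list of vertex tuples (the generated columns, assumed layout)
    costs   : cost of each column, in the same order
    Returns (best cost, tuple of chosen column indices), or (inf, None)
    if the pool contains no feasible n-partition.
    """
    target = set(range(num_vertices))
    best_cost, best_choice = float("inf"), None
    for choice in combinations(range(len(columns)), n):
        covered = [v for k in choice for v in columns[k]]
        # Exactly-once coverage: no repeated vertex and no missing vertex.
        if len(covered) == num_vertices and set(covered) == target:
            cost = sum(costs[k] for k in choice)
            if cost < best_cost:
                best_cost, best_choice = cost, choice
    return best_cost, best_choice


# Illustrative pool for 4 vertices and n = 2; not data from the paper.
pool = [(0, 1), (2, 3), (0, 2), (1, 3), (0, 3)]
pool_costs = [1, 2, 5, 6, 4]
best_cost, best_choice = solve_restricted_pool(pool, pool_costs, 4, 2)
```

Because complementary column generation adds columns in groups that together form a feasible n-partition, a pool built this way always contains at least one feasible selection, which is why the restricted 0-1 program is guaranteed to return a solution.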

S.M. Al-Yakoob is an associate professor at the Department of Mathematics, Kuwait University. His research interests include mathematical programming and optimization with applications to real world problems such as location, transportation, scheduling, and timetabling problems.
H.D. Sherali is a University Distinguished Professor Emeritus at the Industrial and Systems Engineering Department, Virginia Polytechnic Institute and State University. His areas of research interest are in mathematical optimization modelling, analysis, and design of algorithms for specially structured linear, nonlinear, and continuous and discrete nonconvex programs, with applications to transportation, location, engineering and network design, production, economics, and energy systems. He has published over 351 refereed articles in various operations research journals and has (co-)authored nine books, with a total Google Scholar citation count of over 35,700 and an H-index of 75. He is an elected member of the National Academy of Engineering, a fellow of both INFORMS and IIE, and a member of the Virginia Academy of Science, Engineering, and Medicine.